Preface


This  report  represents  the  third  year  of  research  performed  under  the 
auspices  of  the  Joint  Services  Electronics  Program  at  Texas  Tech  University. 

The  program  is  concentrated  in  the  "information  electronics"  area  and  includes 
researchers  from  both  the  departments  of  Electrical  Engineering  and  Mathematics. 
Specific work units deal with Quadratic Optimization Problems, Nonlinear Control,
Nonlinear Fault Analysis, the Qualitative Analysis of Large-Scale Systems,
Multidimensional  System  Theory,  Optical  Noise,  and  Pattern  Recognition. 

Each work unit is represented in the report by a summary of the work performed
during the past year, a list of publications and activities in the area,
reprints of all papers which have been published during the past year, and
abstracts of pending papers. In addition, the report includes a list of all
grants and contracts administered by JSEP personnel and/or the Department of
Electrical Engineering and a list of all publications prepared by JSEP personnel.
Significant Accomplishments Report
A.  Feedback  System  Design 

The  feedback  system  design  problem  may  naturally  be  subdivided  into 
two  tasks: 

i.  satisfaction  of  design  constraints  and 
ii.  optimization  of  system  performance. 

The first and foremost design constraint is stability, though system
specifications may also call for an asymptotic tracking and/or disturbance
rejection constraint. During the past year, working with a team of investigators
from the University of Notre Dame, the University of California, and Texas Tech,
we have formulated a new algebraic fractional representation approach to the
feedback system design problem. Unlike classical design theories wherein a
single solution to the given design problem is formulated, the key to our
approach is a parameterization of the set of compensators which achieve the
design constraints. As such, the design constraints of task i. are satisfied
and the stage is simultaneously set for the optimization problem of task ii.

The design theory is formulated in a very general algebraic setting and
is therefore applicable to any class of linear systems: distributed,
multivariable, time-varying, multidimensional, etc. Moreover, we believe that the
techniques developed can potentially form the basis of an entire family of
design techniques for adaptive and robust control systems. The initial part of
this research is described in a paper which will appear in a forthcoming issue
of the IEEE Transactions on Automatic Control, while a second paper is in
preparation. Furthermore, we have initiated work on the robust and adaptive
control problems, for which an M.S. thesis is presently in preparation.




B.  Pointing and Tracking

During the summer of 1979 Professor T.G. Newman, while working on his
JSEP project at the White Sands Missile Range, developed an entirely new
approach to the problem of identifying multiple moving targets in a scene.

The  target  motion  is  assumed  to  be  modeled  by  the  elements  of  a  Lie  group 
(of  translations,  rotations,  magnifications,  etc.)  and  the  key  to  the  theory 
is  the  formulation  of  an  equation  of  motion  in  the  coordinate  system  of  the 
Lie group rather than in Euclidean coordinates. In this coordinate system
every point of a rigid body is moving at exactly the same speed. As such,
if one numerically computes a velocity profile from photographic data, the
resultant profile will be piecewise constant with distinct levels
corresponding to distinct objects.

Although formulated in a Lie group, the required equation of motion can
be represented by a nonlinear partial differential equation in Euclidean space,
thereby permitting the theory to be implemented via standard numerical
techniques. Indeed, Newman has experimentally implemented the theory using actual
photographic tracking data taken at White Sands. In particular, he
successfully applied the algorithm to an extremely noisy sequence of photographs
of an aircraft moving in front of a mountain, which had been the subject of
several previous unsuccessful attempts at analysis.

C.  Eigenvalue  Computation 

In  large-scale  system  theory,  one  often  encounters  the  problem  of  computing 
the  eigenvalues  for  a  continuously  parameterized  family  of  large  sparse  matrices. 

As an alternative to the classical approach of discretizing the parameter and
using a standard eigenvalue code at each parameter value, we have developed a
new continuation algorithm for eigenvalue computation. Basically, one
formulates a nonlinear ordinary differential equation whose trajectories
represent the eigenvalue loci of the given family of matrices. One then
computes the eigenvalues for an initial matrix, using a standard algorithm,
and uses these eigenvalues as initial conditions for the differential equation
in a numerical integration scheme to compute the eigenvalues of the remaining
matrices in the family.

The key to our continuation algorithm is the formulation of the required
differential equation so as to minimize the computational effort required to
implement the numerical integration scheme. To achieve this goal one must
exploit the sparseness of the given family of matrices and simultaneously
minimize the number of matrix inversions employed. With these points in mind we
have formulated and tested three alternative continuation algorithms for the
solution of the eigenvalue problem: one based on the LU algorithm, one based
on the QR algorithm, and one which employs a Hessenberg form.
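
The continuation idea described above can be illustrated with a minimal sketch. It is not one of the three algorithms of the report (LU, QR, or Hessenberg based), whose details are not given here. It assumes a smooth family A(t) with a simple eigenvalue, differentiates (A - lam I) x = 0 under a fixed normalization of x, and integrates the resulting differential equation with a classical Runge-Kutta scheme. The function names, the dense linear solve, and the test family are illustrative assumptions only; a practical code would exploit sparsity.

```python
import numpy as np

def eig_pair_rate(A, dA, lam, x, c):
    """Return (dx/dt, dlam/dt) for one eigenpair of the family A(t).

    Differentiating (A - lam*I) x = 0 with c^H x held constant gives
        (A - lam*I) dx - x dlam = -dA x,    c^H dx = 0,
    which is solved here as one bordered (dense) linear system.
    """
    n = x.size
    M = np.zeros((n + 1, n + 1), dtype=complex)
    M[:n, :n] = A - lam * np.eye(n)
    M[:n, n] = -x
    M[n, :n] = c.conj()
    rhs = np.concatenate([-dA @ x, [0.0]])
    sol = np.linalg.solve(M, rhs)
    return sol[:n], sol[n]

def track_eigenvalue(A_of, dA_of, lam0, x0, t0=0.0, t1=1.0, steps=200):
    """Integrate the eigenpair ODE from t0 to t1 with classical RK4."""
    lam = complex(lam0)
    x = np.asarray(x0, dtype=complex)
    c = x.copy()                      # fixed normalization vector (assumption)
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1x, k1l = eig_pair_rate(A_of(t), dA_of(t), lam, x, c)
        k2x, k2l = eig_pair_rate(A_of(t + h / 2), dA_of(t + h / 2),
                                 lam + h / 2 * k1l, x + h / 2 * k1x, c)
        k3x, k3l = eig_pair_rate(A_of(t + h / 2), dA_of(t + h / 2),
                                 lam + h / 2 * k2l, x + h / 2 * k2x, c)
        k4x, k4l = eig_pair_rate(A_of(t + h), dA_of(t + h),
                                 lam + h * k3l, x + h * k3x, c)
        x = x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        lam = lam + h / 6 * (k1l + 2 * k2l + 2 * k3l + k4l)
        t += h
    return lam, x

# Hypothetical test family: A(t) = A0 + t*A1 with well separated eigenvalues.
A0 = np.diag([1.0, 2.0, 3.0, 4.0])
A1 = 0.1 * (np.eye(4, k=1) + np.eye(4, k=-1))
lam_end, _ = track_eigenvalue(lambda t: A0 + t * A1, lambda t: A1,
                              lam0=1.0, x0=np.eye(4)[0])
```

The initial eigenpair comes from a standard eigenvalue computation at t = 0 (here it is known by inspection), and only linear solves are needed along the path, which is where sparsity and the choice of factorization would matter.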


Texas  Tech  University 

Joint  Services  Electronics  Program 


Institute for Electronic Science


Research  Unit:  1 

1.  Title  of  Investigation:  Quadratic  Optimization  Problems 

2.  Senior  Investigator:  Richard  Saeks  Telephone:  (806)  742-3528 

3.  JSEP  Funds:  $23,500 

4.  Other  Funds: 

5.  Total Number of Professionals:  PI's 1 (1 mo.)  RA's 1 (1/2 time)

6.  Summary: 

The  goal  of  the  work  unit  is  the  development  of  techniques  for  the  design 
of  modern  robust,  adaptive,  and  decentralized  control  systems.  To  this  end 
a  powerful  quadratic  optimization  theory  previously  developed  by  the  author 
will  be  employed  along  with  a  new  feedback  system  design  theory  developed  by 
the  senior  investigator  and  several  colleagues  under  the  present  work  unit. 

By combining these techniques we have developed a new quadratic optimal control
theory for feedback systems with unstable plants. Moreover, a modified version
of this theory has been developed in which one includes an additional term in
the performance measure to reduce the sensitivity of the system to plant
perturbations. As such, by controlling the weight of this term, one may obtain a
tradeoff between system performance and robustness. In another direction we
have developed a new approach to suboptimal control theory which is capable of
handling systems with non-quadratic performance measures, decentralized systems,
and nonlinear systems.

The major result obtained during the past year has been the formulation of
a new feedback system design theory in which we give an explicit parameterization
of the class of all possible compensators which stabilize a given feedback
system. Moreover, the resultant feedback system gains are linear in the design
parameter. As such, by working with this parameterization, we characterize
all compensators which achieve the stability constraint while simultaneously
simplifying the process of choosing a compensator within this class. A paper
describing this work has been accepted for publication in the IEEE Transactions
on Automatic Control. At the present time we are in the process of extending
this theory to obtain a similar characterization of the compensators which
stabilize a given plant and simultaneously cause it to track and/or reject
prescribed inputs.
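
To make the idea of a compensator parameterization concrete, the following sketch works through a scalar example. It assumes the standard single-input, single-output form of the fractional parameterization, which may differ in notation from the paper cited above: the plant P = N/D is written with stable factors N and D, stable X and Y satisfy XN + YD = 1, and every compensator C = (X + DQ)/(Y - NQ) with stable Q stabilizes the unity negative feedback loop. The plant P(s) = 1/(s - 1), the factors N = 1/(s + 1), D = (s - 1)/(s + 1), X = 2, Y = 1, and the restriction to a constant Q = q are all illustrative assumptions.

```python
import numpy as np

NUM_P = np.array([1.0])          # plant numerator:   1
DEN_P = np.array([1.0, -1.0])    # plant denominator: s - 1 (unstable pole)

def compensator(q):
    """Polynomial (num, den) of C(s) = (2(s+1) + q(s-1)) / (s + 1 - q)."""
    num = np.array([2.0 + q, 2.0 - q])
    den = np.array([1.0, 1.0 - q])
    return num, den

def closed_loop_poles(q):
    """Roots of den_P*den_C + num_P*num_C for unity negative feedback."""
    num_c, den_c = compensator(q)
    char = np.polyadd(np.polymul(DEN_P, den_c), np.polymul(NUM_P, num_c))
    return np.roots(char)

def complementary_sensitivity(q, s):
    """Closed-loop map T = PC/(1 + PC) evaluated at the complex point s."""
    num_c, den_c = compensator(q)
    pc = (np.polyval(NUM_P, s) * np.polyval(num_c, s)) / (
        np.polyval(DEN_P, s) * np.polyval(den_c, s))
    return pc / (1.0 + pc)
```

For this example the characteristic polynomial reduces to (s + 1)^2 for every q, so stability is guaranteed by construction, and T(s) = N(X + DQ) depends affinely on q. This is the sense in which the closed-loop gains are linear in the design parameter, leaving q free for optimization.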

In parallel with the above described work we have developed a new approach
to suboptimal control theory which is applicable to non-quadratic, nonlinear,
and decentralized control problems. Basically, we approximate the given
problem by a linear quadratic problem and compute the classical linear
regulator for this problem, relative to a specified set of weighting matrices.
This regulator is then used in the actual system, with its performance measure
being minimized over the choice of weighting matrices used to construct the
linear quadratic regulator. A reprint of a conference paper in this area is
included in the present report. This describes the general technique and its
application to the design of an aircraft landing system.
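
The procedure just described can be sketched on a toy problem. Everything in the example is an assumption made for illustration and is unrelated to the aircraft landing application: the nonlinear plant dx/dt = -x - 0.5 x^3 + u, its linear approximation dx/dt = -x + u, the non-quadratic measure x^4 + u^2, and the grid of candidate weights. For each state weight q the scalar linear quadratic regulator is computed in closed form, applied to the nonlinear plant, and scored by simulation.

```python
import math

def lqr_gain(q):
    """Scalar LQ regulator gain for dx/dt = -x + u with cost q*x^2 + u^2.

    The gain is the positive root of k^2 + 2k - q = 0.
    """
    return -1.0 + math.sqrt(1.0 + q)

def true_cost(q, x0=2.0, horizon=10.0, steps=5000):
    """Simulate the nonlinear plant under u = -k(q) x and sum x^4 + u^2."""
    k = lqr_gain(q)

    def f(x):
        # closed-loop nonlinear dynamics (hypothetical plant)
        return -x - 0.5 * x ** 3 - k * x

    h = horizon / steps
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -k * x
        cost += (x ** 4 + u ** 2) * h        # rectangle rule
        k1 = f(x)                            # classical RK4 step
        k2 = f(x + h / 2 * k1)
        k3 = f(x + h / 2 * k2)
        k4 = f(x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return cost

def best_weight(candidates):
    """Pick the weight whose regulator performs best on the actual system."""
    return min(candidates, key=true_cost)
```

A grid search is used only for simplicity; any scalar or multivariable minimizer over the weighting matrices could take its place.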

A final aspect of the work is in the decentralized control area and is
represented by a reprint of a paper recently published in the IEEE Transactions
on Automatic Control. This paper proves a surprising theorem to the effect that
decentralized control is just as powerful as centralized (global) control from
the point of view of pole placement (but not optimization) in an interconnected
dynamical system.

7.  Publications  and  Activities: 

A.  Refereed Journal Articles

1.  Saeks, R., "On the Decentralized Control of Interconnected
Dynamical Systems", IEEE Trans. on Auto. Control, Vol. AC-24,
pp. 269-271, (1979).

2.  Desoer, C.A., Liu, R.-W., Murray, J., and R. Saeks, "Feedback
System Design: The Fractional Representation Approach to
Analysis and Synthesis", IEEE Trans. on Auto. Control, (to appear).

B.  Conference Papers and Abstracts

1.  Karmokolias, C., and R. Saeks, "Optimal Selection of Weighting
Matrices in Kalman Regulators", Proc. of the 21st Midwest Symp.
on Circuits and Systems, Iowa State Univ., Ames, Ia., Aug. 1978,
pp. 71-72.

C.  Preprints 

1.  Karmokolias, C., and R. Saeks, "Suboptimal Control with Optimal
Quadratic Regulators", submitted for publication.

2.  Karmokolias, C., and R. Saeks, "Suboptimal Design of an Aircraft
Landing System", submitted for publication.

D.  Theses 

1.  Karmokolias, C., "Suboptimal Control with Optimal Quadratic
Regulators", Ph.D. Dissertation, Texas Tech Univ., 1979.

2.  Chua, O., M.S. Thesis, Texas Tech Univ., (in preparation).

E.  Conferences  and  Symposia 

1.  Saeks, R., 21st Midwest Symp. on Circuits and Systems, Iowa State
Univ., Aug. 1978.

2.  Karmokolias, C., 21st Midwest Symp. on Circuits and Systems, Iowa
State Univ., Aug. 1978.

3.  Saeks,  R.,  IEEE  Decision  and  Control  Conf.,  San  Diego,  Jan.  1979. 

F.  Lectures 

1.  Saeks, R., "Feedback System Design", Elec. Engrg. Colloquium, Univ.
of Texas at El Paso, March 1979.

2.  Saeks, R., "Feedback System Design", Texas Systems Workshop,
Southern Methodist Univ., April 1979.

8.  Reprint of "On the Decentralized Control of Interconnected Dynamical Systems",
by R. Saeks from the IEEE Transactions on Automatic Control,
Vol. AC-24, pp. 269-271, (1979).


On  the  Decentralized  Control  of  Interconnected 
Dynamical  Systems 

R.  SAEKS,  FELLOW,  IEEE 

Abstract: It is shown that the fixed modes of an interconnected dynamical
system under decentralized control are precisely the uncontrollable
and unobservable states of the individual system components. As such,
the system can be stabilized by decentralized controllers if and only if its
individual system components can be stabilized. Moreover, these conditions
are shown to be equivalent to the conditions for stabilizing the system
using a global controller.


INTRODUCTION

Given a linear system with partitioned inputs and outputs

$$\dot{X} = FX + \sum_{i=1}^{n} B^i u_i$$

$$y_i = C^i X \qquad (1)$$

it is desired to design a family of dynamic decentralized controllers

$$\dot{Z}_i = S_i Z_i + R_i y_i$$

$$u_i = Q_i Z_i + K_i y_i \qquad (2)$$



which place the poles (eigenvalues) of the resultant feedback system in
prescribed locations. In its most general form the solution to this
problem was given by Wang and Davison [1]. Their solution is formulated
in terms of the (diagonally) fixed modes of the system

$$\Lambda_d(F,B,C) = \bigcap_{K_d} \lambda(F + B K_d C). \qquad (3)$$

Here, B and C are the matrices $B = \mathrm{row}(B^i)$ and $C = \mathrm{col}(C^i)$,
respectively, $\lambda(M)$ denotes the set of eigenvalues of the matrix M, and the
intersection is taken over the set of block diagonal (complex$^1$) matrices $K_d$
whose partition is conformable with the partitions of B and C. Using
this concept of fixed modes, Wang and Davison [1] and Corfmat and
Morse [2] showed that the eigenvalues of the system can be placed in a
prespecified open region of the complex plane using the dynamic
decentralized controllers of (2) if and only if $\Lambda_d$ lies in that region. More
precisely, they showed that $\Lambda_d$ represents the set of eigenvalues of F
which cannot be moved by any family of decentralized dynamic
controllers, while all remaining eigenvalues of F can be arbitrarily placed by
an appropriate choice of decentralized dynamic controllers [2].

Manuscript received February 23, 19[illegible]; revised October 16, 19[illegible].
Paper recommended by D. Siljak, Chairman of the Large Scale Systems, Differential
Games Committee. This work was supported in part by the Joint Services
Electronics Program at Texas Tech University under ONR Contract 76-C-1136.

The author is with the Department of Electrical Engineering, Texas Tech University,
Lubbock, TX 79409.

$^1$Precisely the same theory can be formulated for systems characterized by real
matrices, although in that case the arguments are complicated by the fact that one
must work with pairs of complex conjugate eigenvalues to preserve reality.

If one observes that $F + BK_dC$ is just the state matrix for the given
system with the static decentralized feedback matrix $K_d$, the above result
can  be  interpreted  as  a  characterization  of  the  eigenvalue  placement 
properties  of  the  system  under  dynamic  decentralized  control  in  terms  of 
its  eigenvalue  placement  properties  under  static  decentralized  control. 
Indeed,  the  theory  states  that  those  eigenvalues  which  can  be  moved  at 
all  by  static  controllers  can  be  arbitrarily  placed  by  dynamic  controllers, 
whereas  those  eigenvalues  which  are  fixed  under  all  static  controllers  are 
also  fixed  by  dynamic  controllers  [1],  [2]. 

Since the partitioning in (1) is arbitrary, the above-described theorem
can be applied to the classical case wherein the given system has only a
single input and output, in which case the fixed modes of the system are
given by

$$\Lambda(F,B,C) = \bigcap_{K} \lambda(F + BKC) \qquad (4)$$

where the intersection is now taken over arbitrary matrices K which are
conformable with B and C. Of course, in this special case $\Lambda(F,B,C)$
reduces to the usual set of eigenvalues which are either uncontrollable or
unobservable [4]. Moreover,

$$\Lambda(F,B,C) \subset \Lambda_d(F,B,C) \qquad (5)$$

since the intersection used to define $\Lambda$ is taken over a larger set of
matrices than that used to define $\Lambda_d$. Equation (5) formalizes the
intuitively obvious fact that a system which is "stabilizable" by a family of
decentralized controllers is also "stabilizable" by a global (centralized)
controller.
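
The definitions (3) and (4) suggest a simple numerical experiment, sketched below under stated assumptions. A mode of F is fixed only if it survives every admissible gain, so the eigenvalues of F + BKC are computed for a few random gains and only those eigenvalues of F that reappear each time are retained. The function name, the random sampling, the tolerance, and the example matrices are illustrative and are not part of the paper; the optional mask zeroes entries of K so that a block diagonal $K_d$ can be imposed.

```python
import numpy as np

def fixed_modes(F, B, C, mask=None, trials=5, tol=1e-6, seed=0):
    """Estimate the modes common to lambda(F + B K C) over random gains K.

    mask is an optional 0/1 array with the shape of K; zeros force the
    corresponding entries of K to vanish (for example a block diagonal K_d).
    """
    rng = np.random.default_rng(seed)
    m, p = B.shape[1], C.shape[0]
    mask = np.ones((m, p)) if mask is None else np.asarray(mask, dtype=float)
    candidates = np.linalg.eigvals(F)          # K = 0 is an admissible gain
    for _ in range(trials):
        K = rng.standard_normal((m, p)) * mask
        eigs = np.linalg.eigvals(F + B @ K @ C)
        keep = [lam for lam in candidates if np.min(np.abs(eigs - lam)) < tol]
        candidates = np.array(keep)
        if candidates.size == 0:
            break
    return candidates

# Hypothetical example: the mode at 3 is not reachable from the input.
F_ex = np.diag([1.0, 2.0, 3.0])
B_ex = np.array([[1.0], [1.0], [0.0]])
C_ex = np.array([[1.0, 1.0, 1.0]])
modes = fixed_modes(F_ex, B_ex, C_ex)
```

Random sampling can only suggest which modes are fixed; it is a numerical check on generic gains, not a proof.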

The purpose of the present paper is to show that (5) holds with
equality in the case where F, B, and C represent the dynamics of an
interconnected dynamical system [4] in which $y_i$ and $u_i$ denote the local
inputs and outputs associated with a given system component. As such,
in that special case the eigenvalues of the system can be placed in
prespecified locations by decentralized dynamic controllers whenever
they can be placed in the same locations by a global dynamic controller.
Although the class of interconnected dynamical systems is considerably
smaller than the class of decentralized systems studied by Wang and
Davison et al., the design of the local controllers for the components of
an interconnected dynamical system is the "physical problem" which
usually motivates the study of the general decentralized control problem.
As such, we believe that the above result is significant.

The class of interconnected dynamical systems which we consider is
characterized schematically in Fig. 1 and mathematically by the set of
equations

$$\dot{X}_i = A_i X_i + B_i a_i$$

$$y_i = C_i X_i \qquad i = 1, 2, \ldots, n \qquad (6)$$

$$a_i = \sum_{j=1}^{n} L^{ij} y_j + u_i$$


Here the first two equations represent the dynamics of the ith
component, whereas the third equation defines the interconnection structure in
which the input to the ith component is taken to be a linear combination
of  the  outputs  of  the  various  components  (including  the  ith)  and  an 
external  control.  The  fact  that  the  control  inputs  u,  are  not  multiplied  by 
compensator  matrices  implies  that  the  local  controllers,  given  by  (2), 
have  full  access  to  the  inputs  of  the  individual  system  components,  and 
similarly,  that  the  output  of  the  individual  system  components  is  fully 
accessible  to  the  controllers. 

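For notational brevity (6) may be restated in block matrix form as

$$\dot{X} = AX + Ba, \qquad y = CX, \qquad a = Ly + u \qquad (7)$$

where $X = \mathrm{col}(X_i)$, $a = \mathrm{col}(a_i)$, $y = \mathrm{col}(y_i)$, $u = \mathrm{col}(u_i)$,
$A = \mathrm{diag}(A_i)$, $B = \mathrm{diag}(B_i)$, $C = \mathrm{diag}(C_i)$, and $L = \mathrm{mat}(L^{ij})$.
Combining these into a single equation for the overall composite system, we
obtain the composite state model

$$\dot{X} = FX + Bu, \qquad y = CX \qquad (8)$$

where

$$F = A + BLC. \qquad (9)$$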

Moreover, upon observing that

$$Bu = \sum_{i=1}^{n} B^i u_i \qquad (10)$$

and

$$y_i = C^i X, \qquad i = 1, 2, \ldots, n \qquad (11)$$

where $B^i = \mathrm{col}(0, 0, \ldots, 0, B_i, 0, \ldots, 0)$ and
$C^i = \mathrm{row}(0, 0, \ldots, 0, C_i, 0, \ldots, 0)$, we see that (8) naturally
decomposes into the form of the decentralized control problem of (1). In (8),
however, the $B^i$ and $C^i$ matrices take on a special form, whereas they are
arbitrary in (1). Indeed, it is this special form which yields the desired
equality in (5). Intuitively, this implies that the ith local controller may
drive only the state of the ith system component, although that state may, in
turn, drive the remainder of the system through the connection equations.
Similarly, the ith local controller may observe only the state of the ith
component, with the remaining components being observed only indirectly through
the state of the ith component.
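
As a concrete reading of (7) through (11), the sketch below assembles the composite model from component matrices and recovers the channel matrices $B^i$ and $C^i$. The helper names and the two-component example are assumptions made for illustration only.

```python
import numpy as np

def block_diag(blocks):
    """Place the given 2-D arrays along the diagonal of a zero matrix."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

def composite_model(A_list, B_list, C_list, L):
    """Return (F, B, C) of (8)-(9) from the components and L = mat(L^{ij})."""
    A, B, C = block_diag(A_list), block_diag(B_list), block_diag(C_list)
    return A + B @ L @ C, B, C

def channel_matrices(B, C, B_list, C_list):
    """Split B into column blocks B^i and C into row blocks C^i, as in (10)-(11)."""
    col_edges = np.cumsum([0] + [b.shape[1] for b in B_list])
    row_edges = np.cumsum([0] + [c.shape[0] for c in C_list])
    Bs = [B[:, col_edges[i]:col_edges[i + 1]] for i in range(len(B_list))]
    Cs = [C[row_edges[i]:row_edges[i + 1], :] for i in range(len(C_list))]
    return Bs, Cs

# Hypothetical two-component system, each output fed to the other component.
A_list = [np.array([[0.0, 1.0], [-2.0, -3.0]]), np.array([[-1.0]])]
B_list = [np.array([[0.0], [1.0]]), np.array([[1.0]])]
C_list = [np.array([[1.0, 0.0]]), np.array([[1.0]])]
L_ex = np.array([[0.0, 1.0], [1.0, 0.0]])
F_c, B_c, C_c = composite_model(A_list, B_list, C_list, L_ex)
Bs, Cs = channel_matrices(B_c, C_c, B_list, C_list)
```

Each $B^i$ has nonzero rows only in the ith component's state block and each $C^i$ has nonzero columns only there, which is the special structure the argument relies on.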

Main  Theorem 

Using the above notation, our main theorem may be stated as follows.
Theorem: For the system of (8)

$$\Lambda_d(F,B,C) = \Lambda(F,B,C) = \Lambda(A,B,C) = \bigcup_i \Lambda(A_i,B_i,C_i). \qquad (12)$$

Proof: To show that $\Lambda(F,B,C) = \Lambda(A,B,C)$ we simply observe that

$$\lambda(F + BKC) = \lambda(A + B(L + K)C) = \lambda(A + BK'C) \qquad (13)$$

where $K' = L + K$. As such, the same set of matrices are spanned if one
takes the intersection of the $\lambda(F + BKC)$ over K or the intersection of the
$\lambda(A + BK'C)$ over $K'$, and hence $\Lambda(F,B,C) = \Lambda(A,B,C)$. Moreover, since
A, B, and C are block diagonal, $\Lambda(A,B,C)$ is just the union of the fixed
modes associated with each block. Given (5), to prove the validity of the
first equality of (12), it suffices to show that $\Lambda_d(F,B,C) \subset \Lambda(F,B,C)$. For
this purpose, we desire to show that if $\lambda$ is not in $\Lambda(F,B,C)$ then it is not
in $\Lambda_d(F,B,C)$. Initially, we assume that A, B, and C are partitioned as 2
by 2 matrices, the general case following therefrom by induction. If $\lambda$ is
not in $\Lambda(F,B,C)$, then there exists a K (dependent on $\lambda$) such that
$\det(\lambda I - (F + BKC)) \neq 0$ and we desire to construct a block diagonal $K_d$,
also dependent on $\lambda$, such that $\det(\lambda I - (F + BK_dC)) \neq 0$. To this end we
write out the matrix $F + BKC$ in partitioned form and expand its
determinant via the formula for the determinant of a 2 by 2 partitioned
matrix [3], obtaining

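$$0 \neq \det(\lambda I - (F + BKC)) = \det(\lambda I - (A + B(L + K)C))$$

$$= \det \begin{bmatrix} \lambda I - A_1 - B_1 L^{11} C_1 - B_1 K^{11} C_1 & -B_1 L^{12} C_2 - B_1 K^{12} C_2 \\ -B_2 L^{21} C_1 - B_2 K^{21} C_1 & \lambda I - A_2 - B_2 L^{22} C_2 - B_2 K^{22} C_2 \end{bmatrix}$$

$$= \det(\lambda I - A_1 - B_1 L^{11} C_1 - B_1 K^{11} C_1) \cdot \det[\lambda I - A_2 - B_2 L^{22} C_2 - B_2 K^{22} C_2 - (B_2 L^{21} C_1 + B_2 K^{21} C_1)(\lambda I - A_1 - B_1 L^{11} C_1 - B_1 K^{11} C_1)^{-1}(B_1 L^{12} C_2 + B_1 K^{12} C_2)]$$

$$= \det(\lambda I - A_1 - B_1 L^{11} C_1 - B_1 K^{11} C_1) \det[\lambda I - A_2 - B_2 L^{22} C_2 - B_2 \hat{K}^{22} C_2 - B_2 L^{21} C_1 (\lambda I - A_1 - B_1 L^{11} C_1 - B_1 K^{11} C_1)^{-1} B_1 L^{12} C_2] \qquad (14)$$

where

$$\hat{K}^{22} = K^{22} + K^{21} C_1 (\lambda I - A_1 - B_1 L^{11} C_1 - B_1 K^{11} C_1)^{-1} B_1 L^{12} + L^{21} C_1 (\lambda I - A_1 - B_1 L^{11} C_1 - B_1 K^{11} C_1)^{-1} B_1 K^{12} + K^{21} C_1 (\lambda I - A_1 - B_1 L^{11} C_1 - B_1 K^{11} C_1)^{-1} B_1 K^{12}$$

and

$$L = \begin{bmatrix} L^{11} & L^{12} \\ L^{21} & L^{22} \end{bmatrix}, \qquad K = \begin{bmatrix} K^{11} & K^{12} \\ K^{21} & K^{22} \end{bmatrix}$$

are partitioned to be conformable with A, B, and C. Now, if we define
$K_d$ via

$$K_d = \begin{bmatrix} K^{11} & 0 \\ 0 & \hat{K}^{22} \end{bmatrix} \qquad (18)$$

and compute $\det(\lambda I - (F + BK_dC))$, we obtain

$$\det(\lambda I - (F + BK_dC)) = \det(\lambda I - (A + B(L + K_d)C))$$

$$= \det \begin{bmatrix} \lambda I - A_1 - B_1 L^{11} C_1 - B_1 K^{11} C_1 & -B_1 L^{12} C_2 \\ -B_2 L^{21} C_1 & \lambda I - A_2 - B_2 L^{22} C_2 - B_2 \hat{K}^{22} C_2 \end{bmatrix}$$

$$= \det(\lambda I - A_1 - B_1 L^{11} C_1 - B_1 K^{11} C_1) \det[\lambda I - A_2 - B_2 L^{22} C_2 - B_2 \hat{K}^{22} C_2 - B_2 L^{21} C_1 (\lambda I - A_1 - B_1 L^{11} C_1 - B_1 K^{11} C_1)^{-1} B_1 L^{12} C_2]$$

$$= \det(\lambda I - (F + BKC)) \neq 0. \qquad (19)$$

Thus, there is at least one block diagonal $K_d$ matrix such that $\lambda$ is not an
eigenvalue of $F + BK_dC$, showing that $\lambda$ is not in $\Lambda_d(F,B,C)$. Note that,
in general, $K_d$ is complex even for a real K if $\lambda$ is complex. To obtain a
real $K_d$ one would have to work with complex conjugate pairs of
eigenvalues rather than single eigenvalues. Since this would further
complicate an already complex derivation, we will not attempt to construct
a real $K_d$ here. Also note that we have assumed that the upper
left-hand corner of the matrix of (14) is nonsingular.

To extend the above argument from 2 by 2 partitioned matrices to n
by n partitioned matrices, we repeat the above construction n - 1 times
as follows. Given an n by n matrix

$$K = \left[\begin{array}{c||ccc} K^{11} & K^{12} & \cdots & K^{1n} \\ \hline\hline K^{21} & K^{22} & \cdots & K^{2n} \\ \vdots & \vdots & & \vdots \\ K^{n1} & K^{n2} & \cdots & K^{nn} \end{array}\right]$$

such that $\det(\lambda I - (F + BKC)) \neq 0$, it is partitioned into a 2 by 2 matrix as
shown by the double line, whence the above argument is employed to
formulate a matrix

$$\hat{K} = \left[\begin{array}{c||ccc} K^{11} & 0 & \cdots & 0 \\ \hline\hline 0 & \hat{K}^{22} & \cdots & \hat{K}^{2n} \\ \vdots & \vdots & & \vdots \\ 0 & \hat{K}^{n2} & \cdots & \hat{K}^{nn} \end{array}\right]$$

such that $\det(\lambda I - (F + B\hat{K}C)) \neq 0$. This matrix is then repartitioned into
a new 2 by 2 matrix as shown by the double line in (22) and the process
is repeated. Since the 1-1 entry in the partitioned matrix is not affected
by the process, this results in a new matrix of the form

$$\tilde{K} = \left[\begin{array}{cc||ccc} K^{11} & 0 & 0 & \cdots & 0 \\ 0 & \hat{K}^{22} & 0 & \cdots & 0 \\ \hline\hline 0 & 0 & \tilde{K}^{33} & \cdots & \tilde{K}^{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & \tilde{K}^{n3} & \cdots & \tilde{K}^{nn} \end{array}\right] \qquad (22)$$

such that $\det(\lambda I - (F + B\tilde{K}C)) \neq 0$. Repeating the process n - 1 times
eventually results in a block diagonal matrix $K_d$ such that
$\det(\lambda I - (F + BK_dC)) \neq 0$, showing that if $\lambda$ is not in $\Lambda(F,B,C)$, then it is
also not in $\Lambda_d(F,B,C)$, thereby verifying that $\Lambda_d(F,B,C)$ is contained in
$\Lambda(F,B,C)$ and completing the proof of the theorem.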

Given the theorem, for the class of interconnected dynamical systems,
local control is just as good as global control from the point of view of
pole placement. From the point of view of optimal control, however,
global control will, in general, still be superior since it gives one a greater
range of options [4].
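
The statement of the theorem can be checked numerically on a small example, sketched here under stated assumptions. The uncontrollable or unobservable eigenvalues of each component are found with the standard rank (PBH) test, and the decentralized fixed modes of the composite system are estimated by sampling random block diagonal gains. The two-component system, the sampling approach, and all function names are illustrative and do not appear in the paper.

```python
import numpy as np

def component_fixed_modes(A, B, C, tol=1e-9):
    """Eigenvalues of A failing the PBH controllability or observability test."""
    n = A.shape[0]
    fixed = []
    for lam in np.linalg.eigvals(A):
        M = lam * np.eye(n) - A
        controllable = np.linalg.matrix_rank(np.hstack([M, B]), tol=tol) == n
        observable = np.linalg.matrix_rank(np.vstack([M, C]), tol=tol) == n
        if not (controllable and observable):
            fixed.append(lam)
    return fixed

def decentralized_fixed_modes(F, B, C, mask, trials=6, tol=1e-6, seed=1):
    """Eigenvalues of F that survive several random gains with pattern mask."""
    rng = np.random.default_rng(seed)
    modes = list(np.linalg.eigvals(F))
    for _ in range(trials):
        Kd = rng.standard_normal(mask.shape) * mask
        eigs = np.linalg.eigvals(F + B @ Kd @ C)
        modes = [lam for lam in modes if np.min(np.abs(eigs - lam)) < tol]
    return modes

# Hypothetical components; the mode at 2 of component 1 is uncontrollable.
A1, B1, C1 = np.diag([-1.0, 2.0]), np.array([[1.0], [0.0]]), np.array([[1.0, 1.0]])
A2, B2, C2 = np.array([[-3.0]]), np.array([[1.0]]), np.array([[1.0]])
L = np.array([[0.0, 1.0], [1.0, 0.0]])

A = np.block([[A1, np.zeros((2, 1))], [np.zeros((1, 2)), A2]])
B = np.block([[B1, np.zeros((2, 1))], [np.zeros((1, 1)), B2]])
C = np.block([[C1, np.zeros((1, 1))], [np.zeros((1, 2)), C2]])
F = A + B @ L @ C                      # composite state matrix of (9)

local = component_fixed_modes(A1, B1, C1) + component_fixed_modes(A2, B2, C2)
decentral = decentralized_fixed_modes(F, B, C, mask=np.eye(2))
```

In this example both lists contain only the mode at 2, in agreement with (12); the interconnection L changes the remaining eigenvalues of F but not the set of fixed modes.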
Abstract

This  paper  suggests  an  approach  to  optimally  selecting  the  weighting  matrices  in 
a  Kalman  regulator,  by  minimizing  an  arbitrarily  defined  performance  index. 


1.  INTRODUCTION AND PROBLEM STATEMENT

The solution of a common class of control problems
involves minimizing a functional J, termed the
"performance index", over a class of functions
which are the allowable inputs to the system. In
general, J represents a compromise between the
system's performance and the energy content of the
system's inputs. More precisely, the problem may
be formulated as follows

$$\min_{u(t)} J = \int_{t_0}^{t_f} \{x'(t) Q x(t) + u'(t) R u(t)\}\, dt \qquad (1.1)$$

subject to

$$\dot{x}(t) = F(t) x(t) + G(t) u(t) \qquad (1.2)$$

$$x(t_0) = x_0 \qquad (1.3)$$

The Q and R matrices in Equation (1.1) are called
the "state weighting matrix" and the "input
weighting matrix", respectively. Both terms in
the integrand of Equation (1.1) can be thought of
as cost functions. The x'(t) Qx(t) term represents
the cost associated with the state trajectory, and
the u'(t) Ru(t) term represents the energy expenses
needed to achieve such a trajectory. Very often the
off-diagonal entries in the Q and R matrices are
assumed to be zero. Then, the two matrices also
represent measures of relative importance among the
states and the various inputs of the system.


Thus, the designer must select the two weighting
matrices Q and R. Frequently, the selection is
made by trial and error. An approximate procedure
is suggested in (2). In this technique,
an entry in Q is arbitrarily selected and the
remaining entries in Q and R are obtained by
assuming:

(1) The maximum contributions to J by the
x'(t) Qx(t) must occur simultaneously in time.

(2) The total contribution of the x'(t) Qx(t)
term must equal the total contribution of
the u'(t) Ru(t) term.

Obviously, the first assumption is not always
valid, whereas the second is indeed quite
arbitrary. In some cases, the Q and R matrices are
obtained by solving the Inverse Control Problem
where a linear control law is assumed for the
feedback and then the weighting matrices are
obtained from Equation (1.1) subject to this
condition (3,4).

In any case, the selection of weighting matrices,
although a matter of experience and ingenuity, is
generally suggested by factors external to the
system. In most cases, the plant S, which is to
be controlled, is a subsystem of a general system Σ
and so it is factors in Σ which dictate the
performance specifications of S. Thus, Q and R
could be selected by examining the effect that the
performance of S has upon Σ.

Hence, consider a linear dynamic system S which is
a subsystem of a system Σ, as shown in Figure 1-1.
Assume that S is controlled by a Kalman regulator
where the Q and R matrices are to be specified.

Figure 1-1.  A General System Σ

Assume that a functional on Σ can be defined as

$$J_\Sigma = \int_{t_0}^{t_f} h(x(t), u(t), t)\, dt \qquad (1.4)$$

where h(·,·,·) is some given function, possibly
nonlinear. Then the problem may in general be
formulated as follows:

$$\min_{Q,R} J_\Sigma = \int_{t_0}^{t_f} h(x^*(t), u^*(t), t)\, dt \qquad (1.5)$$

subject to

$$\min_{u(t)} J_S = \int_{t_0}^{t_f} \{x'(t) Q x(t) + u'(t) R u(t)\}\, dt \qquad (1.6)$$

subject to

$$\dot{x}(t) = F(t) x(t) + G(t) u(t) \qquad (1.7)$$

$$x(t_0) = x_0 \qquad (1.8)$$

where u*(t) is the optimal solution of Equation
(1.6) and x*(t) is obtained from Equation (1.7)
and Equation (1.8) by setting u(t) = u*(t). To
ensure a unique solution of Equation (1.5), it is
assumed that h(·,·,·) is convex in Q and R and
that one of the entries of R is arbitrarily set to
1. The optimal input is then

$$u^*(t) = -R^{-1} G'(t) P(t) x(t) \qquad (1.9)$$

where P(t) is the solution of the matrix Riccati
Equation

$$\dot{P}(t) = -F'(t)P(t) - P(t)F(t) + P(t)G(t)R^{-1}G'(t)P(t) - Q \qquad (1.10)$$

$$P(t_f) = 0 \qquad (1.11)$$

Then substituting Equation (1.9) into Equation
(1.7), x*(t) is given by

$$x^*(t) = \Phi(t, t_0) x_0 \qquad (1.12)$$

where $\Phi(t, t_0)$ is the solution of

$$\dot{\Phi}(t, t_0) = [F(t) - G(t)R^{-1}G'(t)P(t)]\Phi(t, t_0) \qquad (1.13)$$

$$\Phi(t, t) = I. \qquad (1.14)$$

Since, in general, h(x*(t), u*(t), t) is not given,
it appears on first sight as if one has traded the
arbitrariness in selecting Q and R for the
arbitrariness in selecting the performance index $J_\Sigma$.
Although this is in fact true, the non-linearity of
h(x*(t), u*(t), t) makes the selection of $J_\Sigma$ much
easier than the selection of Q and R. In fact,
concepts from utility theory and decision theory
are readily applicable, thus facilitating the
selection of $J_\Sigma$.

On the other hand, the problem of Equation (1.5) is
to be solved only once over extended periods of
time. Thus, though h(x*(t), u*(t), t) is in general
a complicated functional, the computational
costs are not prohibitive.

2.  SCALAR EXAMPLES

It is assumed that the system dynamics are given as

$$\dot{x}(t) = -x(t) + u(t) \qquad (2.1)$$

$$x(t_0) = x_0 \qquad (2.2)$$

and the S performance index is given as

$$J_S = \int_0^1 \{q x^2(t) + u^2(t)\}\, dt. \qquad (2.3)$$

The Riccati equation is

$$\frac{d}{dt} P(q,t) = 2P(q,t) + P^2(q,t) - q \qquad (2.4)$$

$$P(q,1) = 0 \qquad (2.5)$$



Following  a  Method  described  In  (3),  an  analytic 
solution  for  Equation  (2.4)  Is  obtained  by  solving 
the  au^Mnted  differential  equation 


k  e(t 


-1  -1 

.T)  • 

-q  1 


6(t.T) 


e(t,t)  •  I  (2.7) 

The eigenvalues of Equation (2.6) are

    \lambda_{1,2} = \pm \sqrt{q + 1}.    (2.8)

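The eigenvectors associated with the eigenvalues
of Equation (2.8) are

    p_1 = \begin{bmatrix} 1 \\ -(1 + \sqrt{q+1}) \end{bmatrix},  p_2 = \begin{bmatrix} 1 \\ -1 + \sqrt{q+1} \end{bmatrix}    (2.9)

so that

    p = \begin{bmatrix} 1 & 1 \\ -(1 + \sqrt{q+1}) & -1 + \sqrt{q+1} \end{bmatrix}    (2.10)

and

    p^{-1} = \frac{1}{2\sqrt{q+1}} \begin{bmatrix} -1 + \sqrt{q+1} & -1 \\ 1 + \sqrt{q+1} & 1 \end{bmatrix}    (2.11)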
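The eigen-decomposition above can be checked numerically. The following is a minimal sketch, assuming the augmented matrix of Equation (2.6) is [[-1, -1], [-q, 1]] and that the eigenvalue +sqrt(q+1) is paired with the first eigenvector; the function names are illustrative and do not come from the paper.

```python
import math

def eigen_data(q):
    """Eigenvalues, eigenvector matrix p and its inverse, as in (2.8)-(2.11)."""
    s = math.sqrt(q + 1.0)
    lam = (s, -s)
    p = [[1.0, 1.0], [-(1.0 + s), -1.0 + s]]
    p_inv = [[(-1.0 + s) / (2 * s), -1.0 / (2 * s)],
             [(1.0 + s) / (2 * s), 1.0 / (2 * s)]]
    return lam, p, p_inv

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_residual(q):
    """Largest entry of |A p - p Lambda| and |p p^{-1} - I| for a given q."""
    a = [[-1.0, -1.0], [-q, 1.0]]
    lam, p, p_inv = eigen_data(q)
    ap = matmul(a, p)
    p_lam = [[p[i][j] * lam[j] for j in range(2)] for i in range(2)]
    ident = matmul(p, p_inv)
    r1 = max(abs(ap[i][j] - p_lam[i][j]) for i in range(2) for j in range(2))
    r2 = max(abs(ident[i][j] - (1.0 if i == j else 0.0))
             for i in range(2) for j in range(2))
    return max(r1, r2)
```

A residual at rounding level for several values of q indicates that the columns of p are eigenvectors of the augmented matrix and that the stated inverse is consistent with p.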

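Thus, a fundamental matrix for Equation (2.6) is

    \Theta(t) = p e^{\Lambda t} p^{-1}    (2.12)

where

    e^{\Lambda t} = \begin{bmatrix} e^{\sqrt{q+1}\, t} & 0 \\ 0 & e^{-\sqrt{q+1}\, t} \end{bmatrix}    (2.13)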


Thus, a transition matrix for Equation (2.6) is

    \Phi(t,\tau) = \Theta(t) \Theta^{-1}(\tau)    (2.14)

But since

    \Theta^{-1}(\tau) = (p^{-1})^{-1} e^{-\Lambda\tau} p^{-1} = p e^{-\Lambda\tau} p^{-1}    (2.15)

Equation (2.14) becomes

    \Phi(t,\tau) = p e^{\Lambda(t-\tau)} p^{-1}    (2.16)

Hence, since \Phi(\tau,\tau) = I, the solution of
Equation (2.6) is

    \Theta(t,\tau) = \Phi(t,\tau).    (2.17)

Then, the solution of Equation (2.4) and Equation
(2.5) is

    P(q,t) = \Theta_{21}(t,\tau) \, \Theta_{11}^{-1}(t,\tau)    (2.18)


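Substituting, with \tau = 1 as required by Equation
(2.5), and performing the indicated calculations,

    P(q,t) = \frac{ q \left[ -e^{(t-1)\sqrt{q+1}} + e^{-(t-1)\sqrt{q+1}} \right] }
                  { (-1 + \sqrt{q+1}) \, e^{(t-1)\sqrt{q+1}} + (1 + \sqrt{q+1}) \, e^{-(t-1)\sqrt{q+1}} }    (2.19)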
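As a sanity check on the closed form (2.19), the sketch below evaluates P(q, t) and measures how well it satisfies the scalar Riccati equation in the form dP/dt = 2P + P^2 - q, which is the form implied by the dynamics (2.1) and the augmented matrix of (2.6). The function names and the finite-difference step are assumptions made for illustration only.

```python
import math

def riccati_p(q, t):
    """Closed-form P(q, t) of Equation (2.19), with terminal time 1."""
    s = math.sqrt(q + 1.0)
    d = t - 1.0
    num = q * (-math.exp(s * d) + math.exp(-s * d))
    den = (-1.0 + s) * math.exp(s * d) + (1.0 + s) * math.exp(-s * d)
    return num / den

def riccati_residual(q, t, h=1e-5):
    """Central-difference estimate of dP/dt - (2P + P^2 - q)."""
    dp = (riccati_p(q, t + h) - riccati_p(q, t - h)) / (2 * h)
    p = riccati_p(q, t)
    return dp - (2 * p + p * p - q)
```

The terminal condition P(q, 1) = 0 holds by construction, and for t < 1 the value is positive, so the feedback u = -P x adds damping to the open-loop system.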


Hence, by (3), the optimal input is

    u^*(t) = -P(q,t) x(t)    (2.20)

Then, substituting Equation (2.20) into Equation
(2.1),

    \frac{d}{dt} x^*(t) = -(1 + P(q,t)) x^*(t)    (2.21)


Integrating Equation (2.21) and using Equation
(2.2),

    x^*(t) = x_0 \exp\left[ -t - \int_0^t P(q,\sigma) \, d\sigma \right]    (2.22)
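Equation (2.22) can be evaluated by numerical quadrature. The sketch below uses Simpson's rule for the integral of P and compares the result with a closed form, x_0 Theta_11(t,1) / Theta_11(0,1), obtained from the transition matrix (2.16). That closed form is a cross-check derived here, not a formula stated in the paper, and all function names are illustrative.

```python
import math

def riccati_p(q, t):
    """Closed-form P(q, t) of Equation (2.19), with terminal time 1."""
    s = math.sqrt(q + 1.0)
    d = t - 1.0
    num = q * (-math.exp(s * d) + math.exp(-s * d))
    den = (-1.0 + s) * math.exp(s * d) + (1.0 + s) * math.exp(-s * d)
    return num / den

def x_star(q, x0, t, n=200):
    """Equation (2.22) with the integral of P computed by Simpson's rule (n even)."""
    if t == 0.0:
        return x0
    h = t / n
    total = riccati_p(q, 0.0) + riccati_p(q, t)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * riccati_p(q, k * h)
    integral = total * h / 3.0
    return x0 * math.exp(-t - integral)

def x_star_closed(q, x0, t):
    """Cross-check (assumption of this sketch): x0 * Theta_11(t,1) / Theta_11(0,1)."""
    s = math.sqrt(q + 1.0)

    def theta11(tt):
        d = tt - 1.0
        return ((s - 1.0) * math.exp(s * d) + (s + 1.0) * math.exp(-s * d)) / (2 * s)

    return x0 * theta11(t) / theta11(0.0)
```

Agreement between the two functions supports both the sign convention of the Riccati equation and the exponent in (2.22).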

It is now assumed that the T performance index is
of the form

Case 1

    J_T = \int_0^1 \{ [1 - x^*(t)]^2 + [u^*(t)]^2 \} \, dt    (2.23)


Thus, substituting Equation (2.20) and Equation
(2.22) into Equation (2.23),

    J_T = \int_0^1 \Big\{ \big[ 1 - x_0 e^{-(t + \int_0^t P(q,\sigma) d\sigma)} \big]^2
          + \big[ -P(q,t) \, x_0 e^{-(t + \int_0^t P(q,\sigma) d\sigma)} \big]^2 \Big\} \, dt    (2.24)


where P(q,t) is given by Equation (2.19). The q
minimizing Equation (2.24) is calculated for several
values of x_0, using numerical integration
techniques.
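The paper does not say which quadrature or search method was used. As one possible sketch, the code below evaluates J_T of Equation (2.24) with the trapezoidal rule and picks the minimizing q by a plain grid search over [0, q_max]; the grid bounds, step counts, and function names are assumptions of this example. The required trajectory is passed as `ref`, so other required trajectories can be tried without changing the code.

```python
import math

def riccati_p(q, t):
    """Closed-form P(q, t) of Equation (2.19), with terminal time 1."""
    s = math.sqrt(q + 1.0)
    d = t - 1.0
    num = q * (-math.exp(s * d) + math.exp(-s * d))
    den = (-1.0 + s) * math.exp(s * d) + (1.0 + s) * math.exp(-s * d)
    return num / den

def j_t(q, x0, ref=lambda t: 1.0, n=400):
    """Trapezoidal estimate of int_0^1 {[ref(t) - x*(t)]^2 + [u*(t)]^2} dt."""
    h = 1.0 / n
    g = 0.0                       # running integral of P(q, sigma) from 0 to t
    p_prev = riccati_p(q, 0.0)
    total = 0.0
    for k in range(n + 1):
        t = k * h
        p = riccati_p(q, t)
        if k > 0:
            g += 0.5 * h * (p_prev + p)
        x = x0 * math.exp(-t - g)         # Equation (2.22)
        u = -p * x                        # Equation (2.20)
        w = 0.5 if k in (0, n) else 1.0   # trapezoid end-point weights
        total += w * ((ref(t) - x) ** 2 + u ** 2)
        p_prev = p
    return total * h

def q_opt(x0, ref=lambda t: 1.0, q_max=3.0, steps=300):
    """Grid search for the q in [0, q_max] that minimizes j_t."""
    grid = [q_max * k / steps for k in range(steps + 1)]
    return min(grid, key=lambda q: j_t(q, x0, ref))
```

For example, `q_opt(0.5)` returns 0.0, and `q_opt(-1000.0)` lands near 1, since for large |x_0| the index J_T is dominated by the same quadratic terms as J_S with q = 1. A required trajectory other than the constant 1 can be supplied as, say, `ref=lambda t: 1.0 - t`.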

At this point a brief discussion of the problem may
be useful. Refer to Figure 2-1. As indicated, the
performance required from the system is indeed
quite unrealistic. But this should be expected
since J_T does not necessarily consider the physical
limitations of the system. Furthermore,
given any initial condition x_0, the optimal
selection of q is not obvious.

In Figure 2-2, the q minimizing Equation (2.24) is
plotted versus the initial condition x_0. As seen,
q \to 1 as |x_0| \to \infty. This is reasonable since,
for |x_0| being very large,

    (x(t) - 1)^2 \approx x(t)^2    (2.25)

and hence, equal importance is placed upon the
state and the input terms of J_T, Equation (2.23).

Figure 2-1. Required Versus Actual Performance.
Solid Lines are the Required Performance Whereas
Dotted Lines are the Actual.

Figure 2-2. Optimal q Versus Initial Condition x_0.

As x_0 goes to zero from the positive side, the
state approximates its requirement better and
better and hence the regulator is "instructed" to
emphasize the input cost by decreasing q. For x_0
between 0 and 1, q_opt is equal to zero. This also
was to be expected since for a positive initial
value the response of the first order system is
always bounded above by its natural response.

Hence, the regulator is "instructed" to totally
"ignore" the state in an attempt to force the
system to achieve the state upper bound. When
x_0 = 0, by Equation (2.20) and Equation (2.22),
x*(t) = u*(t) = 0. Thus, by Equation (2.3), q_opt
is indeterminate. Once x_0 becomes slightly
negative, the regulator is "instructed" to drive
x(t) to zero as soon as possible since x(t) is now
adding to the error. As x_0 becomes more and more
negative, the error in the input becomes
significant also, and thus, as stated earlier, q_opt
approaches 1. Thus, it was seen that even in this
case where one normally would not use a Kalman
regulator, the results agree with intuition.


Case 2

    J_T = \int_0^1 \{ [1 - t - x^*(t)]^2 + [u^*(t)]^2 \} \, dt    (2.26)

A Kalman regulator is a more "natural" controller
for this case. Yet the comments made for the
previous case are also applicable here. Figure 2-3
shows required versus actual performance and
Figure 2-4 shows the q minimizing Equation (2.26)
versus the initial condition x_0.


Figure  2-3.  Required  Versus  Actual  Performance. 
Solid  Lines  are  the  Required  Performance  Whereas 
Dotted  Lines  are  the  Actual. 


Figure 2-4. Optimal q Versus Initial Condition x_0.
3.  APPLICATIONS 

The proposed approach could be applied in several
real world situations. One such case is the landing
of an aircraft, where a Kalman regulator could
be used to drive the aircraft from some initial
state x_0 to a zero state. Generally speaking, one
is interested in a safe, comfortable and economical
landing. Towards achieving these goals, certain
restrictions have been or could be imposed on the
states and the inputs of the system, either by law
or regulation or as a matter of policy or merely
due to the physical limitations of the aircraft.
These restrictions are then used to determine the
overall system performance index J_T.

Another case is the control of a large system
consisting of a number of interconnected subsystems.
Several schemes have been proposed (5,6,7) where a
Kalman regulator is used to control each subsystem
independently from the others. The main differences
in these schemes arise in how the couplings
are handled. More often than not, the
assumption of weak couplings is made. In the
numerical approach suggested here, no assumption
is made about the nature of the couplings and thus
the method is equally applicable to systems with
either strong or weak couplings. The J_T index
could even be assumed to be quadratic in this
case. So, assuming that the Q and R matrices for
the overall system are given, one seeks the
weighting matrices for each of the subsystems.

To perform the desired minimization efficiently,
an efficient algorithm for calculating the
solution of the Riccati equation is needed for
the various entries in the Q and R matrices.
Though not presented here, an algorithm has been
derived where the solution P is calculated by
first obtaining the value of P for the initial
point (q_0, r_0, t_0) and then solving two linear,
time-invariant differential equations to obtain the
value of P at some other point (q, r, t).

In fact, it was the efficiency of this calculation
that dictated the choice of the present approach
over the alternative offered by the Inverse
Control Problem, since in the latter case the
optimization had to be carried out over a time-varying
matrix K(t), rather than a time-independent
matrix (Q + R), a task considerably more
difficult.

10. Abstract of "Feedback Systems Design: The Fractional Representation Approach
to Analysis and Synthesis" by Desoer, C.A., Liu, R.-W.,
Murray, J.J. and R. Saeks, to appear in the IEEE Transactions
on Automatic Control.


The problem of designing a feedback system with prescribed properties
is attacked via a fractional representation approach to feedback system
analysis and synthesis. To this end we let H denote a ring of operators with
the prescribed properties and model a given plant as the ratio of two operators
in H. This, in turn, leads to a simplified test to determine whether or not a
feedback system in which that plant is embedded has the prescribed properties
and a complete characterization of those compensators which will "place" the
feedback system in H. The theory is formulated axiomatically to permit its
application in a wide variety of system design problems and is extremely
elementary in nature, requiring no more than addition, multiplication,
subtraction, and inversion for its derivation even in the most general settings.


11.  Abstract  of  "Suboptimal  Control  with  Optimal  Quadratic  Regulators"  by 
C.  Karmokolias  and  R.  Saeks. 


The purpose of the present paper is to describe an approach to the control
system design problem wherein one designs a quadratic regulator for
an approximation of the given system but chooses the weighting matrices
for the regulator to optimize its performance as a controller of the actual
system, relative to a prescribed (not necessarily quadratic) performance
measure. The advantage of such an approach is that the resultant regulator
has the same "ease of implementation" and most of the "stability characteristics"
associated with the classical LQG problem. The disadvantage is that the system
performance is suboptimal.


12. Abstract of "Suboptimal Design of an Aircraft Landing System" by
C. Karmokolias and R. Saeks.


The design of an aircraft landing system is carried out via a suboptimal
algorithm. In particular, a specific optimal controller is obtained
by restricting the design to the class of quadratic regulators for an
unconstrained linear approximation to the given system. A suboptimal controller
is then chosen from within that class by optimizing over the choice of weighting
matrices. The resultant control system compares well with previous designs
and was obtained without undue computational effort.

presently investigating the possible extension of this theory to the observ¬

x = f(x) 2.

Controllability of General Nonlinear Systems

L.  R.  Hunt* 

Department of Mathematics, Texas Tech University, Lubbock, Texas 79409


Abstract. Consider the nonlinear system

    \dot{x}(t) = f(x(t)) + \sum_{i=1}^{m} u_i(t) g_i(x(t)),  x(0) = x_0 \in M,

where M is a C^∞ real n-dimensional manifold, f, g_1, ..., g_m are C^∞ vector
fields on M, and u_1, ..., u_m are real-valued controls. If m = n - 1 and f,
g_1, ..., g_m are linearly independent, then the system is called a hypersurface
system, and necessary and sufficient conditions for controllability are
known. For a general m, 1 ≤ m ≤ n - 1, and arbitrary C^∞ vector fields f,
g_1, ..., g_m, assume that the Lie algebra generated by f, g_1, ..., g_m and by taking
successive Lie brackets of these vector fields is a vector bundle with
constant fiber (vector space) dimension p on M. By Chow's Theorem there
exists a maximal C^∞ real p-dimensional submanifold S of M containing x_0
with the generated bundle as its tangent bundle. It is known that the
reachable set from x_0 must contain an open set in S. The largest open subset
U of S which is reachable from x_0 is called the region of reachability from
x_0. If O is an open subset of S which is reachable from x_0, we find necessary
conditions and sufficient conditions on the boundary of O in S so that
O = U. Best results are obtained when it is assumed that the Lie algebra
generated by g_1, ..., g_m and their Lie brackets is a vector bundle on M.


I.  Introduction 


Let f, g_1, ..., g_m be C^∞ vector fields on a connected C^∞ real n-dimensional
manifold M. If u_1, ..., u_m are real-valued controls, then we examine the
controllability of the system

    \dot{x}(t) = f(x(t)) + \sum_{i=1}^{m} u_i(t) g_i(x(t)),  x(0) = x_0 \in M.

The system is called a hypersurface system if m = n - 1 and if f, g_1, ..., g_{n-1} are
linearly independent on M, and necessary and sufficient conditions for
controllability are given in [5]. For a general m, 1 ≤ m ≤ n - 1, we know that the
reachable set of this system contains an open subset of the real p-dimensional
submanifold S of M given to us by Chow's Theorem. Here we assume that the
Lie algebra generated by f, g_1, ..., g_m and by taking successive Lie brackets of f,
g_1, ..., g_m is a vector bundle with constant fiber (vector space) dimension p on M
and that f, g_1, ..., g_m are complete vector fields on M (which may not be linearly
independent on M).

Let U be the largest open subset of S which is reachable from x_0. We call
such a set the region of reachability from x_0. If U ≠ S, we find necessary
conditions on the boundary of O (∂O) in S so that O is the region of reachability
from x_0. If O is an open subset of S containing x_0 which is reachable from
x_0, we give sufficient conditions on ∂O in S so that O = U for certain systems.

The hypersurface case is the nice case in which complete answers can be
derived. In section 2 of this paper we discuss the results from [5] for
hypersurface systems. Section 3 contains necessary conditions that an open subset of
S be the region of reachability from x_0 for a general system. In section 4 we
discuss sufficient conditions on the boundary of an open set in S so that this
open set is the region of reachability.


II.  Hypersurface  Systems 

Even though the results of this section are restricted to hypersurface systems
(with linearly independent vector fields f, g_1, ..., g_{n-1})

    \dot{x}(t) = f(x(t)) + \sum_{i=1}^{n-1} u_i(t) g_i(x(t)),  x(0) = x_0 \in M,    (2.1)

our definitions (with the exception of Definitions 2.3 and 2.4) apply to general
systems

    \dot{x}(t) = f(x(t)) + \sum_{i=1}^{m} u_i(t) g_i(x(t)),  x(0) = x_0 \in M.    (2.2)

Let T(M) denote the tangent bundle of M with fiber T_x(M) for x ∈ M. If X
is a vector field on M (i.e. X is a global section of T(M)) then α is an integral
curve of X if α is a C^∞ mapping from a closed interval I ⊂ R into M such that
dα(t)/dt = X(α(t)) for all t ∈ I.


Definition 2.1 [8]. If D is a subset of T(M), then an integral curve of D is a
mapping α from a real interval [t, t'] into M such that there exist t = t_0 < t_1
< ... < t_k = t' and vector fields X_1, ..., X_k in D with the restriction of α to
[t_{i-1}, t_i] being an integral curve of X_i, for each i = 1, 2, ..., k.

Definition 2.2. Let D be a subset of T(M) and let x_0 ∈ M. A point x ∈ M is
D-reachable from x_0 if there is an integral curve α of D and some T > 0 in the
interval for α such that α(0) = x_0 and α(T) = x. A subset A of M is D-reachable
from x_0 if every point x ∈ A is D-reachable from x_0.

Since the D we consider is the subset of T(M) given by the vector fields in
(2.1) or (2.2), we drop the D from D-reachable. For hypersurface systems such as
(2.1) we know that there is an open set in M which is reachable from x_0 and which
contains x_0 in its closure.

Definition 2.3. The largest open subset U of M which is reachable from x_0 for
system (2.1) is called the region of reachability from x_0. If U = M, we say that the
system is controllable from x_0.

Definition 2.4. Let O be an open subset of M and let x ∈ ∂O. Then f points in
the direction of O (or towards O) at x if there exists an open neighborhood W of
x in M such that the vector assigned by f at x, projected into M (by the
exponential map), and intersected with W - {x} is contained in O. If this is true
for every x ∈ ∂O, then f points in the direction of O on ∂O.

If f points in the direction of O at x ∈ ∂O and if ∂O is C^1 near x, then there
is some open neighborhood W of x in M so that the integral curve of f (moving
in positive time) starting at x and intersected with W - {x} is contained in O.

The following result from [5] gives necessary and sufficient conditions on
the boundary of an open set of M for this set to be the region of reachability of
a hypersurface system. Recall that a C^∞ submanifold B of M is an integral
manifold of the linearly independent vector fields Y_1, ..., Y_k on M if T_y(B) is the
space spanned by Y_1, ..., Y_k at y for each point y ∈ B.

Theorem 2.5 [5]. Suppose x_0 ∈ M and O is an open subset of M which contains x_0
in its closure and which is reachable from x_0. Then O is the region of reachability
U from x_0 of the system (2.1) if and only if ∂O is an integral manifold of
g_1, ..., g_{n-1} and f assigns vectors on ∂O which point in the direction of O. In fact,
the smallest open subset U of M with x_0 ∈ \bar{U} (the closure of U) satisfying ∂U is an
integral manifold of g_1, ..., g_{n-1} and f points in the direction of U on ∂U is the
region of reachability from x_0.
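A small worked example may help fix ideas. It is not taken from [5]; the manifold, the vector fields, and the initial point are chosen here purely for illustration.

```latex
% Illustrative hypersurface system (assumed example): n = 2, m = n - 1 = 1.
\[
  M = \mathbb{R}^2, \qquad
  f = \frac{\partial}{\partial x_1}, \qquad
  g_1 = \frac{\partial}{\partial x_2}, \qquad
  \dot{x}_1 = 1, \quad \dot{x}_2 = u_1(t), \qquad x(0) = x_0 = (0,0).
\]
% f and g_1 are linearly independent at every point of M.
% For any T > 0 we have x_1(T) = T > 0, while x_2(T) = \int_0^T u_1(t)\,dt
% can be made equal to any real number. Hence
\[
  U = \{ (x_1, x_2) \in \mathbb{R}^2 : x_1 > 0 \}, \qquad
  \partial U = \{ x_1 = 0 \}, \qquad x_0 \in \bar{U}.
\]
% \partial U is the vertical line through x_0, an integral manifold of g_1,
% and f = \partial / \partial x_1 points into U at every point of \partial U,
% in agreement with the two conditions of Theorem 2.5.
% Since U \neq M, this system is not controllable from x_0.
```

The example also shows why the initial point is only required to lie in the closure of the region of reachability: with T > 0, the point x_0 itself is never revisited.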

As a corollary we proved in [5] that if there is no integral manifold of
g_1, ..., g_{n-1} which disconnects M in some sense, then the system (2.1) is
controllable from any x_0 ∈ M. Also if such integral manifolds exist, but the vector
field f assigns vectors on each which always point in both directions relative to the
manifold (i.e. if an integral manifold N divides M into two components, then f
assigns vectors in the direction of one component at some points of N and
toward the second component at other points of N), then the system (2.1) is
controllable from any x_0 ∈ M.

The hypersurface case is ideal in the sense that we obtain clear cut solutions
to problems concerning reachability and controllability. For this reason we use
it as a model for our system (2.2), even though we don't get as satisfying results
in general due to the inherent nature of the system. The difference seems to be
that ∂U is an integral manifold of g_1, ..., g_{n-1} in the hypersurface case, which
implies that ∂U is a C^∞ manifold. This is not true in general for
nonhypersurface systems.


III.  Necessary  Conditions 


We need to state several definitions and results before considering the general
problem. Again, let M be a connected C^∞ real n-dimensional manifold. If
f_1, ..., f_r are C^∞ vector fields on M, we denote by [f_i, f_j] the Lie bracket of
f_i and f_j, 1 ≤ i, j ≤ r, i ≠ j.

The set of vector fields {f_1, ..., f_r} is called involutive if there exist C^∞ functions
γ_{ijk}(x) on M such that

    [f_i, f_j](x) = \sum_{k=1}^{r} \gamma_{ijk}(x) f_k(x).

We state the following version of the Frobenius Theorem from [1].

Theorem 3.1. Let f_1(x), ..., f_r(x) be an involutive collection of C^∞ linearly
independent vector fields on M. Given any point x_0 ∈ M there exists a unique maximal
C^∞ submanifold S of M containing x_0 such that T_x(S) is the space spanned by
f_1(x), ..., f_r(x) for every x ∈ S; i.e. S is the unique integral manifold of
f_1(x), ..., f_r(x) through x_0 and contained in M.

It is well known that under the conditions of the Frobenius Theorem, M is a
C^∞ (n - r)-parameter family of C^∞ r-dimensional manifolds.

Next we consider the possibility that the set of vector fields f_1(x), ..., f_r(x) is
not involutive. Suppose f_1(x), ..., f_r(x) are complete (i.e. the integral curves of
each f_i are defined for all -∞ < t < ∞). Given f, for each t, (exp tf) defines a map
of M into itself which is produced by the flow on M defined by the differential
equation \dot{x} = f(x(t)), and (exp tf)x_0 denotes the solution starting at x_0 and
moving in the positive time sense. Repeated exponentials like (exp tf_2)(exp tf_1)x_0
mean that we start at x_0, move along the integral curve of f_1 for positive t units
of time, and then along the integral curve of f_2 for positive t units of time. We
denote the smallest subgroup of the diffeomorphisms of M with itself which
contains exp tf for all f in {f_1, ..., f_r} by {exp{f}}_G.

Let L_A be the Lie algebra of vector fields generated by f_1, ..., f_r and by
taking successive Lie brackets of these vector fields. We assume that this is a
vector bundle with vector space dimension p on M. Defining {exp{f}_{L_A}}_G,
{exp{f}}_G x_0 and {exp{f}_{L_A}}_G x_0 in the obvious manner, we state the following
version of Chow's Theorem (see [1]).

Theorem 3.2 [2]. Given any point x_0 ∈ M, there exists a unique maximal C^∞ real
p-dimensional submanifold S ⊂ M containing x_0 such that {exp{f}}_G x_0 =
{exp{f}_{L_A}}_G x_0 = S. This S is the unique submanifold of M through x_0 having L_A
as its tangent bundle.

Thus, under the hypotheses of Chow's Theorem, M is a C^∞ (n - p)-parameter
family of C^∞ p-dimensional manifolds.
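The dimension count in Chow's Theorem can be illustrated numerically. The sketch below approximates a Lie bracket by central differences and checks that two vector fields on R^3, together with their bracket, span the tangent space at a point. The sign convention [f, g] = Dg f - Df g, the two example fields, and all function names are assumptions of this sketch and are not taken from the paper.

```python
def jacobian(field, x, h=1e-6):
    """Central-difference Jacobian of a vector field R^n -> R^n at the point x."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = field(xp), field(xm)
        cols.append([(fp[i] - fm[i]) / (2 * h) for i in range(n)])
    # cols[j][i] holds d field_i / d x_j; transpose into row-major form
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def lie_bracket(f, g, x):
    """[f, g](x) = Dg(x) f(x) - Df(x) g(x) (sign convention assumed here)."""
    df, dg = jacobian(f, x), jacobian(g, x)
    fx, gx = f(x), g(x)
    n = len(x)
    return [sum(dg[i][k] * fx[k] - df[i][k] * gx[k] for k in range(n))
            for i in range(n)]

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

# Hypothetical pair of complete vector fields on R^3 (illustration only)
def f1(x):
    return [1.0, 0.0, -x[1]]

def f2(x):
    return [0.0, 1.0, x[0]]
```

For these fields the bracket is the constant vector (0, 0, 2), so f1, f2 and [f1, f2] are linearly independent everywhere. The generated Lie algebra then has dimension p = 3 = n at every point, and the submanifold S of Theorem 3.2 through any x_0 is all of R^3.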

We now return to our system (2.2) and let L_A be the Lie algebra of vector
fields on M generated by f, g_1, ..., g_m and by taking successive Lie brackets. If we
could control the drift term f from (2.2), then the reachable set from x_0 ∈ M is
equal to S, assuming that the hypotheses of Chow's Theorem are satisfied for
the vector fields f, g_1, ..., g_m. We generalize Definition 2.3 as follows.

Definition 3.3. Let f, g_1, ..., g_m satisfy the hypotheses of Chow's Theorem with
the dimension of L_A being p. If S is the C^∞ p-dimensional submanifold of M
through x_0 of Chow's Theorem and if f is treated as a drift term without control,
then the largest open subset U of S which is reachable from x_0 for system (2.2) is
called the region of reachability from x_0. If U = S, we say that the system is
S-controllable from x_0.

An obvious question is that given a system

    \dot{x}(t) = f(x(t)) + \sum_{i=1}^{m} u_i(t) g_i(x(t)),  x(0) = x_0 \in M,

satisfying Chow's Theorem, is there a nonempty open subset of S which is
reachable from x_0 when f is the drift term? This question is answered
affirmatively by the proof of Theorem 3.1 in [8]. Thus Definition 3.3 is not vacuous. We
remark that the work in [8] is for real analytic vector fields and real analytic
manifolds. However, the advantage of real analytic over C^∞ is that the Jacobian
matrix of a real analytic map that has maximal rank at some point must have
maximal rank at almost all points in the appropriate sense (see [8]). In [7] Krener
gives an interesting proof of the existence of a reachable open set in S.

Unlike the hypersurface case, we do not require that f, g_1, ..., g_m be linearly
independent on M. Unless otherwise specified, for the remainder of this article
we assume L_A is a vector bundle with vector space dimension p, that f, g_1, ..., g_m
are complete vector fields, and that S is the manifold through x_0 given by
Chow's Theorem. Also, since we can use unbounded (both positive and negative)
controls, we may as well assume it is possible to move along the integral
curves of g_1, ..., g_m.

In the following definition A and B are C^1 submanifolds of M of dimensions
k and n - k respectively.

Definition 3.4. The manifolds A and B intersect transversally at a point x ∈ A ∩
B if and only if T_x(A) ⊕ T_x(B) = T_x(M), where ⊕ denotes the direct sum.

We need a sequel for Definition 2.4.

Definition 3.5. Let O be an open set in S and let x ∈ ∂O (taken relative to S).
Then f points in the direction of O (or towards O) at x if there exists an open
neighborhood W of x in M such that the vector assigned by f at x, projected
into S (by the exponential map), and intersected with W - {x} is contained in
O. If this is true for every x ∈ ∂O, then f points in the direction of O on ∂O.

If f points in the direction of O at x ∈ ∂O and if ∂O is C^1 near x, then there
is some open neighborhood W of x in M so that the integral curve of f (moving
in positive time) starting at x and intersected with W - {x} is contained in O.

Definition 3.5'. Let O be an open set in S and let x ∈ ∂O. Then f points in the
direction of \bar{O} (the closure of O in S) at x if and only if there exists an open
neighborhood W of x in M such that the integral curve of f starting at x and
intersected with W is contained in \bar{O}. If this is true for every x ∈ ∂O, then f
points in the direction of \bar{O} on ∂O.

Our first result concerning necessary conditions parallels Theorem 3.2 found
in [5]. Let L'_A be the Lie algebra generated by g_1, ..., g_m and their Lie brackets,
and let {L'_A}_x be the restriction of L'_A to the point x.

Theorem 3.6. Let O be an open set in S which is reachable from x_0 for system
(2.2), and let x be an arbitrary point in ∂O. Suppose there is an open neighborhood
W of x in M such that W ∩ ∂O is a C^1 real (p-1)-dimensional submanifold of S. If
any one of the following conditions holds, then O is not the region of reachability
from x_0:

i) the dimension of {L'_A}_x as a vector space is p,

ii) the integral curve of some g_i, 1 ≤ i ≤ m, is transversal to ∂O at x,

iii) f assigns at x a vector which does not point in the direction of \bar{O}.

Proof. The neighborhood W of x in M will be made smaller whenever
necessary.

If i) holds at x then it holds for all points in W ∩ ∂O since p is maximal by
the assumption on L_A and L'_A ⊂ L_A. Suppose that each g_i, 1 ≤ i ≤ m, assigns only
tangent vectors to W ∩ ∂O; i.e. none of the g_i's is transversal at any point of
W ∩ ∂O. Then the Lie algebra generated by g_1, ..., g_m and successive Lie brackets
is contained in the tangent bundle to W ∩ ∂O, a contradiction since this bundle
is (p-1)-dimensional. Thus there is a point y ∈ W ∩ ∂O arbitrarily close to x with
the integral curve of some g_i, 1 ≤ i ≤ m, transversal to ∂O at y, and i) reduces to
ii).

Suppose that condition ii) holds at x. If the integral curve of g_1, chosen
arbitrarily from g_1, ..., g_m and renumbered if necessary, is transversal to ∂O at x,
then it is transversal to ∂O in W ∩ ∂O. Following the integral curves of g_1 in S
that begin at points in O which are sufficiently close to x, and continuing past
W ∩ ∂O, we have that O ⊂ S is not the region of reachability from x_0. This is
true since O is reachable from x_0.

If iii) is true at x, then the arguments given in ii) with f replacing g_1 imply
that O is not the largest reachable region from x_0. □

We state a result similar to the necessary part of Theorem 2.5 under the
local assumption of a C^1 boundary.

Theorem 3.7. Let U be the region of reachability from x_0 of the system (2.2).
Suppose ∂U is a C^1 manifold for an open neighborhood W of x ∈ ∂U in M. Then
W ∩ ∂U contains the integral curves of g_1, ..., g_m in W which intersect ∂U, and f
assigns vectors to W ∩ ∂U which point in the direction of \bar{U}. Moreover, there is an
open dense set in W ∩ ∂U for which the vectors from f point in the direction of U.

Proof. It is obvious from Theorem 3.6 that we need only prove the last
statement of Theorem 3.7. Suppose there is an open set V in W ∩ ∂U for which
the vector field f is contained in the tangent bundle to V. Then the bundle
generated by f, g_1, ..., g_m and their Lie brackets on V is contained in the tangent
bundle to the (p-1)-dimensional manifold V. This contradicts our assumption
that the vector space dimension of L_A is p. □

In addition to our hypothesis that the dimension of L_A is p, suppose that we
also let the dimension of L'_A be constant on M. Given any point x ∈ M, Chow's
Theorem gives us a C^∞ maximal integral manifold through x with L'_A as its
tangent bundle. In Theorem 3.7 we would have that ∂U must contain these
integral manifolds if x ∈ ∂U.

The following theorem from [6] will allow us to reduce somewhat the C^1
assumption of Theorem 3.7. The statement concerning a C^2 boundary can be
relaxed to C^1, or we can simply replace C^1 in our preceding results by C^2.

Theorem 3.8 [6]. Let M be a C^∞ manifold of dimension n, and let H be a
subbundle of the tangent bundle to M with vector space dimension n - 1. Suppose
U ⊂ M is an open set with the property that if O ⊂ U is an open set having a C^2
boundary then for each x ∈ ∂O ∩ ∂U we have T_x(∂O) = H_x (the vector space of H
at x). Then for each point x ∈ ∂U, there is a neighborhood V of x, a real-valued
function h ∈ C^∞(V) with nonzero differential for all points in V, and a closed
nowhere dense set E ⊂ R such that

1) ∂U ∩ V = {x ∈ V | h(x) ∈ E},

2) for each t ∈ E, S_t = {x ∈ V | h(x) = t} is an integral manifold of H, i.e. the
boundary of U is foliated by integral manifolds of H.

Under the restriction that L'_A is a bundle we need no differentiability
restrictions as in Theorem 3.7.

Theorem 3.9. Let U be the region of reachability from x_0 of the system (2.2). If
the vector space dimension of L'_A is the constant p' < p on M, then ∂U contains the
C^∞ p'-dimensional integral manifolds (or more generally, the foliation of such
manifolds) of the bundle L'_A that intersect ∂U. Also the vector field f always points
in the direction of \bar{U} on ∂U.

Proof. Let x be any point in $\partial U$. There is an open neighborhood W of x in M
which consists of a $C^\infty$ $(n-p')$-parameter family of $p'$-dimensional integral
manifolds of $L'_A$. Since $L'_A \subset L_A$, $W \cap S$ consists of a $C^\infty$ $(p-p')$-parameter
family of $p'$-dimensional integral manifolds of $L'_A$. Take an arbitrary $C^\infty$
1-parameter family of the $p'$-dimensional integral manifolds the union of which
contains x in its interior. This 1-parameter family forms a $C^\infty$ $(p'+1)$-dimensional
manifold L containing x, and $L'_A$ is a $p'$-dimensional subbundle of
its tangent bundle. If L (or an open neighborhood of x in L) is contained in $\partial U$,
there is nothing to prove. If $\partial U \cap L$ contains an open set in L and x is a
boundary point of this open set, then we simply choose another L so that this
does not occur. Otherwise, we apply Theorem 3.8 with U replaced by $U \cap L$, $\partial U$
by $\partial U \cap L$, M by L, and H by $L'_A$, and Theorem 3.7 (see the remarks after the
proof of Theorem 3.7) to complete the proof of the first conclusion.

Suppose $x \in \partial U$ and f does not point towards $\bar U$ at x. Moving along the integral
curve of f starting at x we reach a point in $S \setminus \bar U$, the complement of $\bar U$ in S.
Starting at all points in $\partial U$ for an open neighborhood V of x in $\partial U$, by
continuous dependence on initial data and uniqueness of integral curves, we
find that we can reach an open set in $S \setminus \bar U$ from V. It remains to be shown that we
can reach an open set in $S \setminus \bar U$ from U, a reachable set. Now all integral curves of f
which pass through V fill up an open set in S containing x in its closure. Since
the intersection of this open set with U contains an open set with V in its
boundary, we can reach an open subset of $S \setminus \bar U$ along integral curves of f from U.
Hence U is not the region of reachability from $x_0$, a contradiction. □

Suppose there exist no sets in S like $\partial U$ of Theorem 3.9 which disconnect S
in the appropriate sense (see Corollary 4.3 in [5]). Then our system is
S-controllable from an arbitrary $x_0 \in S$.

If $L'_A = L_A$ (i.e. $p = p'$), then by part i) of Theorem 3.6 the system (2.2) is
S-controllable from any $x_0 \in S$. If $p' = p - 1$, then the system has many
properties of the hypersurface system (2.1).

Theorem 3.10. Let U be the region of reachability of the system (2.2) from $x_0$ and
assume that $p' = p - 1$. Then $\partial U$ is an integral manifold of the bundle $L'_A$ and f
assigns vectors on $\partial U$ which point in the direction of $\bar U$ on $\partial U$ and in the direction
of U for an open dense subset of $\partial U$.

Proof. Since $p' = p - 1$, we have by Theorem 3.9 that $\partial U$ is a $C^\infty$ integral
manifold of $L'_A$. The statement concerning f follows from Theorem 3.7. □
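
As a concrete illustration of the case $p' = p - 1$, the following worked example uses a hypothetical planar system chosen only for illustration (it does not appear in the source), assuming unbounded real-valued controls. The sign of the first bracket is written as $\pm$ because it depends on the sign convention adopted for the Lie bracket; the second bracket comes out the same under either convention.

```latex
\begin{aligned}
&\dot{x}_1 = u, \quad \dot{x}_2 = x_1^2, \qquad f(x) = (0,\, x_1^2), \quad g(x) = (1,\, 0), \\
&[f,g](x) = \pm(0,\, 2x_1), \qquad [[f,g],g](x) = (0,\, 2), \\
&\Rightarrow\ p = 2 \ \text{at every point of } \mathbb{R}^2, \qquad
  p' = \dim \operatorname{span}\{g\} = 1 = p - 1, \\
&U = \{x : x_2 > b\}, \qquad \partial U = \{x : x_2 = b\} \quad \text{for } x_0 = (a,\, b).
\end{aligned}
```

Here $\partial U$ is an integral curve of g, and f points toward $\bar U$ everywhere on $\partial U$ and into U wherever $x_1 \neq 0$, which is an open dense subset of $\partial U$, in line with Theorem 3.10. That U is the region of reachability can be checked directly in this example: $x_2$ never decreases along trajectories, and any point with $x_2 > b$ is reached by moving $x_1$ quickly with a large control, waiting at some $x_1 \neq 0$, and then moving $x_1$ quickly to its target value.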

This concludes our discussion of necessary conditions for $C^\infty$ manifolds M.
If M is real analytic and if our vector fields on M are real analytic, then
statements of the Frobenius Theorem and Chow's Theorem exist (see [1]) which
will improve some of our results. We shall not go into this matter in this paper.


IV.  Sufficient  Conditions 

Given $x_0 \in M$ and S containing $x_0$, sufficient conditions for an open set $O \subset S$ to
be the region of reachability from $x_0$ for a general system like (2.2) are of
interest to us. One such result in the literature applies to the system

$$\dot{x}(t) = f(x(t)) + u(t)\, g(x(t)), \qquad x(0) = x_0 \in M. \qquad (4.1)$$

Theorem 4.1 [4]. Let $L_A$ be the Lie algebra generated by f and g in (4.1), and let
$L_0$ be the smallest subalgebra of $L_A$ which contains g and is closed under Lie
bracketing with f. Suppose that for all h in $L_0$ we have $[h,g] = \alpha_h g$ for some
constant $\alpha_h$. Then the reachable set from $x_0$ is equal to
$\{\exp L_0\}(\exp tf)\,x_0$.

For  a  system  that  behaves  like  a  hypersurface  system  as  in  Theorem  3.10,  we 
have  the  following  theorem. 

Theorem 4.2. Let $x_0 \in M$ and let O be an open subset of $S \subset M$ which contains $x_0$
in its closure and which is reachable from $x_0$ for (2.2). Suppose that $L'_A$ is a vector
bundle with dimension $p' = p - 1$ on M. If $\partial O$ is an integral manifold of the bundle
$L'_A$ and if f points in the direction of O on $\partial O$, then O is the region of reachability
U from $x_0$.

Proof. Since f points toward O, f and the bundle $L'_A$ must be linearly independent
on $\partial O$. If O is not the region of reachability, then there is a point $x \in \partial O$
and a neighborhood W of x in S which can be reached from $x_0$. Let $X_1, \ldots, X_{p'}$
be a basis for $L'_A$ in W. Since $\partial O$ is a $C^\infty$ integral manifold of $L'_A$, Lie brackets
of the form $[X_i, X_j]$, $1 \le i, j \le p'$, $i \ne j$, yield no "new" directions in which to move
from $W \cap \partial O$ (i.e. the set $\{X_1, \ldots, X_{p'}\}$ is involutive). Because $f, X_1, \ldots, X_{p'}$ span
T(S) on W, brackets like $[f, X_i]$, $i = 1, \ldots, p'$, give us vector fields on W which are
linear combinations of $f, X_1, \ldots, X_{p'}$. The same is also true for successive Lie
brackets. Not being able to control the drift term f, the only linear combinations
which arise from these brackets and which indicate directions in which we can move
are those with a nonnegative coefficient for the f term (see the proof of Theorem
4.3 in which a more general case is considered). Since f points toward O, and $\partial O$
is an integral manifold of $L'_A$, we are unable to move outside of $\bar O$, and in
particular, reach W. □

The above proof follows that of the sufficient conditions for the hypersurface
case given in [5]. These proofs depend on the fact that $[f, X]$, where
$X \in L'_A$, near some boundary point of O, must be a linear combination of f and
basis elements in $L'_A$. There are no "new" directions, given to us by some Lie
bracket, in which we can move.

The following interpretation of the Lie bracket is taken from [3]. Let f and g
be vector fields on a manifold M and let $x_0 \in M$. Then $[f,g]$ is the tangent vector
at $x_0$ to the curve segment

$$t \mapsto \exp(-\sqrt{t}\, f)\, \exp(-\sqrt{t}\, g)\, \exp(\sqrt{t}\, f)\, \exp(\sqrt{t}\, g)\, x_0 \qquad (t \ge 0).$$
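
The curve-segment interpretation can be checked numerically. The sketch below is only an illustration under stated assumptions: the two planar vector fields `f` and `g` are hypothetical examples chosen here, the flows are approximated with a classical Runge-Kutta scheme, and the tangent at $t = 0$ is estimated by a one-sided difference quotient. For these fields the quotient comes out close to $(0, 1)$, which equals $(\partial f/\partial x)\,g - (\partial g/\partial x)\,f$; under the opposite sign convention for the bracket the same vector would be written $-[f,g]$.

```python
import math

def rk4_flow(field, x, duration, steps=50):
    """Approximate exp(duration * field) x with classical RK4 (duration may be negative)."""
    h = duration / steps
    x = list(x)
    for _ in range(steps):
        k1 = field(x)
        k2 = field([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
        k3 = field([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
        k4 = field([xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6.0
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x

def commutator_curve(f, g, x0, t):
    """Point exp(-s f) exp(-s g) exp(s f) exp(s g) x0 with s = sqrt(t)."""
    s = math.sqrt(t)
    x = rk4_flow(g, x0, s)    # the rightmost factor acts first
    x = rk4_flow(f, x, s)
    x = rk4_flow(g, x, -s)
    x = rk4_flow(f, x, -s)
    return x

# Hypothetical illustrative fields on R^2 (assumptions, not from the paper).
def f(x):
    return [0.0, x[0]]

def g(x):
    return [1.0, 0.0]

def tangent_at_zero(f, g, x0, t=1e-4):
    """One-sided difference quotient approximating the tangent of the curve at t = 0."""
    xt = commutator_curve(f, g, x0, t)
    return [(a - b) / t for a, b in zip(xt, x0)]

print(tangent_at_zero(f, g, [2.0, -1.0]))
```

The four flow segments mirror the four moves used later in the proof of Theorem 4.3: along g, along f, back along g, and back along f.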

We return to our system (2.2) and let $x_0$, M, S, $L_A$, and $L'_A$ be as before. Our
next theorem is the converse statement of Theorem 3.9.


Theorem 4.3. Let O be an open subset of $S \subset M$ containing $x_0$ in its closure and
which is reachable from $x_0$. Suppose that $L'_A$ is a vector bundle with vector space
dimension $p'$ on M and $\partial O$ contains the $C^\infty$ $p'$-dimensional integral manifolds of
$L'_A$ that intersect it. If f points in the direction of $\bar O$ on $\partial O$, then O is the region of
reachability from $x_0$ for the system (2.2).


Proof. If we are to reach an open set in the complement of $\bar O$ in S, then we must
pass through $\partial O$ at some point $x \in \partial O$. It is obvious that an open subset of the
complement of $\bar O$ cannot be reached by using the integral curves of f (in positive
time) or the integral curves of any section in the bundle $L'_A$. Moreover, we cannot
reach the complement of $\bar O$ by using a linear combination of these with a nonnegative
coefficient for the $f(x(t))$ term. Hence we must consider Lie brackets of the form
$[f, g_i]$, $1 \le i \le m$, and higher order brackets of this type.

If we travel along an integral curve of f in positive time, then we assume that
we can return along this curve until the starting point is reached. We consider
$[f, g_i]$ at x, our point in $\partial O$, for some i, $1 \le i \le m$. First we move from x in the $g_i$
direction for $\sqrt{t}$ units of time, and we remain in $\partial O$ since $\partial O$ contains the
integral manifolds of $L'_A$. Next, moving along f for $\sqrt{t}$ units of time, we stay in
$\bar O$ because f points in the direction of $\bar O$ on $\partial O$. If f left us in $\partial O$, then using $-g_i$
for $\sqrt{t}$ units of time keeps us in $\partial O$. If f left us in O, then we stay there. Finally,
moving along $-f$ for $\sqrt{t}$ units of time either leaves us in $\bar O$, which is fine, or in
the complement of $\bar O$, which is impossible since we cannot control f. Hence we can
start an integral curve at x in the direction of $[f, g_i]$ if and only if $[f, g_i]$ points
towards $\bar O$. This is true for all $i = 1, \ldots, m$ and also for $[g_i, f]$ at x. By repeating
the above argument for successive Lie brackets, we find that we can start an
integral curve at x in the direction of a particular bracket if and only if that
bracket points in the direction of $\bar O$ at x.

Thus the Lie brackets at x will not let us reach an open set in the complement
of $\bar O$, and O is the region of reachability. □

Suppose that we knew an open subset of S could be reached from $x_0$ which
contains $x_0$ in its closure. Let U be the smallest open subset of S with $x_0 \in \bar U$ and
satisfying the hypotheses of Theorem 4.3. By Theorem 3.9 we can reach U, and
we would have the following result.

Conjecture 4.4. Let $x_0 \in M$ and let U be the smallest open subset of S containing
$x_0$ in its closure and satisfying the hypotheses of Theorem 4.3. Then U is the region
of reachability from $x_0$ for system (2.2).

In the conclusion of Theorem 3.9 and in the hypotheses of Theorem 4.3 we
have that f must point towards $\bar O$ on $\partial O$ (or $\bar U$ on $\partial U$). Suppose there is an open
neighborhood W of $x \in \partial O$ so that all integral curves of f that intersect $W \cap \partial O$
actually are contained in $W \cap \partial O$. Since $\partial O$ contains the integral manifolds of
$L'_A$, following integral curves of f and $g_i$, $1 \le i \le m$, in $W \cap \partial O$ leaves us in
$W \cap \partial O$. But this contradicts the fact that the vector space dimension of $L_A$ is p
and Chow's Theorem.
Abstract

Suppose M is a connected paracompact $C^\infty$ n-dimensional manifold and $f, g_1, \ldots, g_m$
are complete $C^\infty$ vector fields on M. We examine the system

$$\dot{x}(t) = f(x(t)) + \sum_{i=1}^{m} u_i(t)\, g_i(x(t)), \qquad x(0) = x_0 \in M,$$

where $u_1, \ldots, u_m$ are real-valued controls. Under the assumptions that the Lie
algebra generated by $f, g_1, \ldots, g_m$ and the Lie algebra generated by $g_1, \ldots, g_m$ are
vector bundles of dimensions p and $p'$ respectively, with $p' < p$, we characterize
the largest open subset of the submanifold S of M (given by Chow's Theorem)
which is reachable from $x_0$ for our system. If M is a real-analytic manifold and
$f, g_1, \ldots, g_m$ are complete real-analytic vector fields, then the assumption that
the Lie algebra generated by $f, g_1, \ldots, g_m$ is a vector bundle of constant dimension
can be removed. In addition the requirement that the Lie algebra generated
by $g_1, \ldots, g_m$ is a vector bundle of constant dimension can be weakened.


1.  INTRODUCTION

Let M be a connected paracompact $C^\infty$ n-dimensional manifold, and let
$f, g_1, \ldots, g_m$ be complete $C^\infty$ vector fields defined on M. If $u_1, \ldots, u_m$ are
real-valued controls, we are interested in characterizing the reachable set of the system

$$\dot{x}(t) = f(x(t)) + \sum_{i=1}^{m} u_i(t)\, g_i(x(t)), \qquad x(0) = x_0 \in M. \qquad (1.1)$$

We assume that the Lie algebra $L_A$ generated by $f, g_1, \ldots, g_m$ and their Lie
brackets is a vector bundle of dimension p. By Chow's Theorem we know
that the reachable set is contained in a $C^\infty$ p-dimensional submanifold S of M
which is the integral manifold of $L_A$ through $x_0$.(1),(2) For the real
analytic case and the $C^\infty$ case we know that an open subset of S is
reachable.(8),(7) Krener's proof is quite nice in that he shows that we can go up
one dimension at a time until an open subset of S is reached. This parallels the
one-dimension-at-a-time proof of Greenfield which is concerned with
holomorphic extension theory in several complex variables under the assumption
of constant dimension of the Levi algebra.(4) Working through Krener's proof we
find that it is possible to reach an open subset of S which is arbitrarily close to $x_0$.

The largest open subset U of S which is reachable from $x_0$ for the system (1.1)
is called the region of reachability. If we assume that the Lie algebra $L'_A$
generated by $g_1, \ldots, g_m$ and their Lie brackets is a vector bundle with vector
space dimension $p' < p$, then the region of reachability of (1.1) from $x_0$ is a
connected open subset of S, which we now call the domain of reachability. Using
published results we show that the domain of reachability U is the smallest open
subset of S containing $x_0$ in its closure and satisfying the following conditions.

(A)  The boundary of U contains $C^\infty$ $p'$-dimensional integral manifolds of
$L'_A$, and the vector field f assigns vectors on $\partial U$ which point in the direction
of $\bar U$. If $p' = p$ we know that we reach all points in S. For the real-analytic
case we can remove some of our assumptions on the dimensions of $L_A$ and $L'_A$.(6)

Section 2 of this paper contains definitions, and we prove our result concerning
the domain of reachability of the system (1.1) in section 3. In section 4 we
assume that M and $f, g_1, \ldots, g_m$ are real-analytic and improve our main result,
Theorem 3.1, for this case.

2.  DEFINITIONS 

Denote by T(M) the tangent bundle to our n-dimensional manifold M. If $x \in M$,
then $T_x(M)$ is the tangent space in T(M) at x. For X a vector field on M, a curve
$\alpha$ is an integral curve of X if $\alpha$ is a $C^\infty$ mapping from a closed interval
$I \subset \mathbb{R}$ into M so that

$$\frac{d\alpha}{dt} = X(\alpha(t)) \quad \text{for all } t \in I.$$

If D is a subset of T(M), then an integral curve of D is a mapping $\alpha$ from a real
interval $[t, t']$ into M such that there exist $t = t_0 < t_1 < \cdots < t_k = t'$ and
sections $X_1, \ldots, X_k$ of T(M) in D with the restriction of $\alpha$ to $[t_{i-1}, t_i]$
being an integral curve of $X_i$, for each $i = 1, 2, \ldots, k$. For $x_0 \in M$, a point
$x \in M$ is D-reachable from $x_0$ if there is an integral curve $\alpha$ of D and some
$T \ge 0$ in the interval for $\alpha$ such that $\alpha(0) = x_0$ and $\alpha(T) = x$. A subset A
of M is D-reachable from $x_0$ if every point $x \in A$ is D-reachable from $x_0$.

In the remainder of this article the set D of interest to us is the subset of
T(M) given by the system (1.1). Hence we drop the D in the above definitions.
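
The definition of an integral curve of D as a concatenation of integral curves of sections can be sketched computationally. The code below is a minimal illustration under stated assumptions, not the paper's construction: it takes the sections of D to be the fields $f + \sum_i u_i g_i$ for constant control values, uses a classical Runge-Kutta step, and works with a hypothetical planar system ($\dot{x}_1 = u$, $\dot{x}_2 = x_1^2$) invented for this example. All function names are illustrative.

```python
def rk4_step(field, x, h):
    """One classical Runge-Kutta step of size h for x' = field(x)."""
    k1 = field(x)
    k2 = field([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = field([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = field([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h * (a + 2 * b + 2 * c + d) / 6.0
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def section(f, gs, u):
    """Vector field f + sum_i u_i g_i for a fixed control value u (assumed form of a section of D)."""
    def field(x):
        total = list(f(x))
        for ui, g in zip(u, gs):
            gx = g(x)
            total = [t + ui * gk for t, gk in zip(total, gx)]
        return total
    return field

def integral_curve(f, gs, x0, pieces, steps=100):
    """Concatenate integral curves of sections; pieces is a list of (duration, controls)."""
    x = list(x0)
    path = [list(x)]
    for duration, u in pieces:
        field = section(f, gs, u)
        h = duration / steps
        for _ in range(steps):
            x = rk4_step(field, x, h)
            path.append(list(x))
    return path

# Hypothetical system on R^2 (an assumption for illustration only).
def f(x):
    return [0.0, x[0] ** 2]

def g(x):
    return [1.0, 0.0]

path = integral_curve(f, [g], [0.0, 0.0],
                      [(1.0, [1.0]), (0.5, [-4.0]), (2.0, [0.0])])
end_point = path[-1]
print(end_point)
```

Every point on `path` is reachable from the initial point in the sense defined above. For this particular system the second coordinate never decreases along any such curve, so no point below the horizontal line through the initial point is reachable.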


Let S be a p-dimensional submanifold of M, let O be an open subset of S, and
let $x \in \partial O$. A vector field f on M points in the direction of $\bar O$ (the
closure of O in S) at x if there exists an open neighborhood U of x in M so that
the integral curve of f starting at x and intersected with U is contained in
$\bar O$. If this occurs for every point $x \in \partial O$, then f points in the direction
of $\bar O$ on $\partial O$.

Given $C^\infty$ vector fields f and g on M, the Lie bracket of f and g is given by

$$[f,g] = \ldots$$

where ... denotes the Jacobian matrix. Of course we can take successive Lie
brackets like $[f,[f,g]]$, $[g,[f,g]]$, etc.

As before $L_A$ is the Lie algebra generated by $f, g_1, \ldots, g_m$ from (1.1) and all
successive Lie brackets. If $L_A$ is a vector bundle with vector space dimension p,
Chow's Theorem together with results of Krener show that we can reach an open
subset of the $C^\infty$ p-dimensional submanifold $S \subset M$ through $x_0$.(7) The largest
open subset U of S which is reachable for (1.1) from $x_0$ is the region of
reachability from $x_0$. If U = S, we say that the system is S-controllable from $x_0$.

In the next section we give a characterization of the region of reachability of
our system.

3.  THE  DOMAIN  OF  REACHABILITY

Our objective is to prove the following result for system (1.1). This theorem
can be found elsewhere for the case $m = n-1$ and $f, g_1, \ldots, g_{n-1}$ linearly
independent on M.(5)

Theorem 3.1. Let O be the smallest open subset of S containing $x_0$ in its closure
and satisfying the following properties. Suppose that the Lie algebra $L'_A$
generated by $g_1, \ldots, g_m$ and successive Lie brackets is a vector bundle of vector
space dimension $p' < p$ on M and $\partial O$ contains the $C^\infty$ $p'$-dimensional integral
manifolds of $L'_A$ that intersect it. If f points in the direction of $\bar O$ on $\partial O$,
then O is the region of reachability from $x_0$ for our system.

The following two results give necessary conditions and sufficient conditions for
an open set to be the region of reachability from $x_0$ for (1.1).(6) In both
theorems we assume that $L'_A$ has constant vector space dimension $p' < p$ on M.

Theorem 3.2. Let U be the region of reachability from $x_0$ of our system. Then
$\partial U$ contains the $C^\infty$ $p'$-dimensional integral manifolds of $L'_A$ which
intersect it and f points in the direction of $\bar U$ on $\partial U$.

Theorem 3.3. Let O be an open subset of $S \subset M$ containing $x_0$ in its closure and
which is reachable from $x_0$. Suppose $\partial O$ contains the $C^\infty$ $p'$-dimensional
integral manifolds of $L'_A$ which intersect it and f points in the direction of
$\bar O$ on $\partial O$. Then O is the region of reachability from $x_0$ for the system (1.1).

Proof  of  Theorem  3.1. 

As mentioned earlier Krener's proof shows that arbitrarily close to any reachable
point is a reachable open set in S.(7) Thus we can reach an open subset V of S
which contains $x_0$ in its closure. From the definition of reachable, it is obvious
that we can choose V so that $\bar V$ is connected. By the proof of Theorem 3.2 we can
assume that $\partial V$ contains the integral manifolds of $L'_A$ which intersect it and f
points towards $\bar V$ on $\partial V$.(6)

We first show that V is connected. Suppose there exist two disconnected
components $V_1$ and $V_2$ of V with a point $x \in \partial V_1 \cap \partial V_2$. Then the unique
integral manifold N of $L'_A$ through x is contained in $\partial V_1 \cap \partial V_2$ and f
points in the direction of $\bar V_1$ and $\bar V_2$ on this manifold. If every point on N is
an equilibrium point for $\dot{x} = f(x(t))$, then we contradict the fact that the
vector space dimension $p'$ of $L'_A$ is less than the vector space dimension p of
$L_A$ on M. Thus we may as well assume that the differential equation
$\dot{x} = f(x(t))$ has no equilibrium points on N. Each solution of $\dot{x} = f(x(t))$
which starts at a point in N must remain in both $\partial V_1$ and $\partial V_2$ since f points
towards $\bar V_1$ and $\bar V_2$ on $\partial V_1$ and $\partial V_2$, respectively. Through each point
of $\partial V_1 \cap \partial V_2$ we have an integral manifold of $L'_A$. Hence we have a subset
of $\partial V_1 \cap \partial V_2$ which contains both integral curves of f and integral manifolds
of $L'_A$. It is impossible that the vector space dimension of $L_A$ is p on this
subset, a contradiction. Since $\bar V$ is connected, it follows that V is connected.

By Theorem 3.3 we have that V is the region of reachability U from $x_0$ for our
system. It remains to show that in fact V = O. Since $x_0 \in \partial V \cap \partial O$ and
$\partial O$ satisfies the conditions concerning $L'_A$ and f, the argument given above
shows it is impossible to have $V \cap O = \emptyset$. If $V \ne O$, then there is a point
$x \in \partial O \cap \partial V$, and a repeat of the same argument with obvious changes
implies a contradiction. Hence O is the connected region of reachability U of
(1.1) from $x_0$. Q.E.D.

Theorem 3.1 completely characterizes the region of reachability U (or more
correctly, the domain of reachability) of our system by determining conditions
that must be satisfied by $\partial U$. Of course, it would be nice to develop a
computational method for finding integral manifolds of subbundles of the
tangent bundle. Research in this direction is currently being done by Michael
Freeman in the case that M is a real-analytic manifold and $f, g_1, \ldots, g_m$ are
real-analytic vector fields on M. For this reason we prove an improved version
of Theorem 3.1 in the real-analytic case.

4.  REAL-ANALYTIC  CONTROLLABILITY 
In this section we assume that M and $f, g_1, \ldots, g_m$ are real-analytic. Thus we
can apply the real-analytic version of Chow's Theorem which allows us to remove
the assumption that the Lie algebra generated by $f, g_1, \ldots, g_m$ and their Lie
brackets is a vector bundle of dimension p on M.(1),(3) If this Lie algebra has
dimension p at $x_0$, then there is a real-analytic p-dimensional submanifold S of
M through $x_0$ which contains the reachable set.

With this in mind we state and prove the next result.

Theorem 4.1.

If O is the smallest open subset of S containing $x_0$ in its closure such that f
points in the direction of $\bar O$ on $\partial O$, then O is the region of reachability
from $x_0$ for our system provided one of the following conditions is satisfied.

i)  The Lie algebra $L'_A$ generated by $g_1, \ldots, g_m$ and successive Lie brackets is
a vector bundle of dimension $p' < p$ on M and $\partial O$ contains the real-analytic
$p'$-dimensional integral manifolds of $L'_A$ that intersect it.

ii)  The Lie algebra $L'_A$ generated by $g_1, \ldots, g_m$ and successive Lie brackets
has dimension p at $x_0$ and $\partial O$ contains the real-analytic integral manifolds
of $L'_A$ that intersect it.

Proof. The statement regarding i) follows directly from Theorem 3.1.

Suppose that $L'_A$ has vector space dimension p at $x_0$. Since our vector fields
are real analytic, the set of points in S where this dimension is less than p is
nowhere dense in S. Let S' be the largest connected open component of S which
contains $x_0$ and on which the dimension of $L'_A$ is p. By known results we have
that we can reach an open neighborhood of any point at which the dimension of
$L'_A$ is p.(6) Thus S' is a reachable set which is contained in the region of
reachability of our system. If S' = S we have nothing to prove, and we assume
that $\partial S' \subset S$ is nonempty. Let x be an arbitrary point in $\partial S'$. Since the
dimension of $L'_A$ at x is less than p, by the real-analytic version of Chow's
theorem there is a real analytic manifold N (which is the unique integral
manifold of $L'_A$) through x of dimension less than p. Because the dimension of
$L'_A$ is constant along this manifold, N cannot move into any open subset of S
where the dimension of $L'_A$ is p. Thus N must remain in $\partial S'$, and $\partial S'$
consists of the integral manifolds of $L'_A$ that intersect it.

If the integral curve of f starting at some point x in $\partial S'$ moves into the
complement of $\bar S'$, the same is true for all points in $\partial S'$ near x. From this
we conclude that an open subset of the complement of $\bar S'$ can be reached by
starting at points in S' near $\partial S'$. Thus we assume that f points in the
direction of $\bar S'$ on $\partial S'$, implying S' = O. Since the integral manifolds of
$L'_A$ which intersect $\partial S' = \partial O$ are contained in $\partial O$ and since f points
towards $\bar O$ on $\partial O$, a repeat of the arguments given elsewhere shows that O is
the region of reachability from $x_0$ for our system.(6) In this case $x_0$ is
actually an interior point of O because the dimension of $L'_A$ at $x_0$ is p.

For an example we consider the system in $\mathbb{R}^2$,

$$\dot{x}(t) = f(x(t)) + u(t)\, g(x(t)), \qquad x(0) = x_0 \in \mathbb{R}^2, \qquad (4.2)$$

where f and g are real-analytic on $\mathbb{R}^2$. Suppose that we want to reach a point x
in $\mathbb{R}^2$ from $x_0$. First we compute the Lie algebra $L_A$ generated by f and g at
$x_0$. If the vector space dimension at $x_0$ is 1, there is a 1-dimensional integral
manifold N of $L_A$ through $x_0$. If x is not in this integral manifold, there is no
hope to reach it. If x is an element of N, then we must check for equilibrium
points of g in N. If $x_0$ is an equilibrium point, then it divides N into two
components $N_1$ and $N_2$. If f points in the direction of $N_1$ at $x_0$ and
$x \in N_2$, or if f points in the direction of $N_2$ at $x_0$ and $x \in N_1$, then x is
not in the reachable set from $x_0$. So we assume that f points in the direction of
$N_1$ at $x_0$ and $x \in N_1$. Thus we must check all equilibrium points of g in
$N_1$ between $x_0$ and x. [...] cannot reach x.

Suppose that the dimension of $L_A$ at $x_0$ is 2. Then there is a unique integral
manifold, which we also call N, of $L_A$ through $x_0$ that contains the reachable
set of our system from $x_0$. We assume that $x_0$ and x are not equilibrium points
of g, that $x \in N$, and that the equilibrium points of g do not separate N into 2
or more components (remember that this set of equilibrium points is nowhere
dense in N). Thus we can delete the set of equilibrium points of g, and we denote
by N' the remaining subset of N. This set is a connected real-analytic manifold.
We have that the dimension of $L'_A$, the Lie algebra consisting of just g itself,
is 1 on N'. By Theorem 3.1 the problem of determining if x is reachable from
$x_0$ reduces to the problem of deciding if x is in the region of reachability from
$x_0$ or not. This theorem completely characterizes the region by giving necessary
and sufficient conditions on its boundary. Hence we ask if there is an integral
curve of g which separates N' into two components, one containing x and the
other containing $x_0$, such that f points in the direction of the closure of the
component containing $x_0$. If such an integral curve exists, we cannot reach x
from $x_0$. Conversely, if such an integral curve does not exist, then x is
reachable from $x_0$.
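
The separating-curve question in the two-dimensional example lends itself to a rough numerical check. The following sketch is an illustration under stated assumptions rather than a method from the paper: it traces an integral curve of g with a Runge-Kutta scheme and evaluates the determinant $\det[g(x), f(x)]$ at sample points, whose sign indicates on which side of the curve f points. The fields `f`, `f_alt`, and `g` are hypothetical examples, and a sampled test of this kind can only suggest, not prove, that the flow across the curve is unidirectional.

```python
def rk4_step(field, x, h):
    """One classical Runge-Kutta step of size h (h may be negative)."""
    k1 = field(x)
    k2 = field([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = field([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = field([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h * (a + 2 * b + 2 * c + d) / 6.0
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

def trace_integral_curve(g, c, length=5.0, steps=500):
    """Sample the integral curve of g through c, forward and backward in time."""
    h = length / steps
    pts = [list(c)]
    for direction in (1.0, -1.0):
        x = list(c)
        for _ in range(steps):
            x = rk4_step(g, x, direction * h)
            pts.append(list(x))
    return pts

def side_values(f, g, pts):
    """det[g(x), f(x)] at each sample: positive means f points to the left of the
    curve (relative to the direction of g), negative to the right, zero tangent."""
    values = []
    for x in pts:
        gx, fx = g(x), f(x)
        values.append(gx[0] * fx[1] - gx[1] * fx[0])
    return values

def is_unidirectional(values, tol=1e-12):
    """True if f never crosses the sampled curve in both directions."""
    return all(v >= -tol for v in values) or all(v <= tol for v in values)

# Hypothetical fields on R^2 (assumptions for illustration only).
def g(x):
    return [1.0, 0.0]

def f(x):
    return [0.0, x[0] ** 2]   # crosses horizontal lines only upward

def f_alt(x):
    return [0.0, x[0]]        # crosses upward for x1 > 0, downward for x1 < 0

pts = trace_integral_curve(g, [0.0, 1.0])
print(is_unidirectional(side_values(f, g, pts)))
print(is_unidirectional(side_values(f_alt, g, pts)))
```

With `f`, every horizontal line is a candidate barrier, so points below the line through $x_0$ are not reachable; with `f_alt`, the sampled line admits flow in both directions and so does not act as a barrier.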
Let M be a connected $C^\infty$ real n-dimensional
manifold. Given a nonlinear system where the controls enter linearly, we find
the largest open subset of M (or the largest open subset of a submanifold of M)
which is reachable from some initial point $x_0 \in M$. Assumptions that certain
Lie algebras, which are generated by the vector fields of the system, form vector
subbundles of M are made in some cases.


Suppose we have the nonlinear system

$$\dot{x}(t) = f(x(t)) + \sum_{i=1}^{m} u_i(t)\, g_i(x(t)), \qquad x(0) = x_0 \in M,$$

where M is a connected $C^\infty$ real n-dimensional manifold, $f, g_1, \ldots, g_m$ are
complete $C^\infty$ vector fields on M, and $u_1, \ldots, u_m$ are real-valued controls.
We are interested in characterizing the reachable set in M of this system. R. W.
Brockett's paper contains a nice discussion of this topic.

First we consider the hypersurface case in which $m = n-1$ and
$f, g_1, \ldots, g_{n-1}$ are linearly independent on M. It is known that the reachable
set of a hypersurface system contains an open set in M. The largest open subset
of M which is reachable from $x_0$ is called the region of reachability U from
$x_0$, and if U = M, the system is controllable from $x_0$. If $U \ne M$ we prove that
the boundary of U is a $C^\infty$ real $(n-1)$-dimensional submanifold of M which is an
integral manifold of the vector fields $g_1, \ldots, g_{n-1}$, and that the vector field
f points in the direction of U on $\partial U$. This leads to a result which gives us the
region of reachability of the hypersurface system

$$\dot{x}(t) = f(x(t)) + \sum_{i=1}^{n-1} u_i(t)\, g_i(x(t)), \qquad x(0) = x_0 \in M.$$

Theorem 1. Suppose U is the smallest open subset of M with $x_0 \in \bar U$ satisfying
$\partial U$ is an integral manifold of $g_1, \ldots, g_{n-1}$, and f assigns vectors on
$\partial U$ which point in the direction of U. Then U is the region of reachability
from $x_0$ for our hypersurface system.

If no integral manifolds of $g_1, \ldots, g_{n-1}$ as given in the theorem exist, then
the hypersurface system is controllable from any $x_0 \in M$.

The ideas used in proving the above result follow those found in the solution of
the problem of uniqueness of analytic continuation for the CR-functions on a
$C^\infty$ real hypersurface in $\mathbb{C}^n$, $n > 1$.

For a general m, $1 \le m \le n-1$, and arbitrary $C^\infty$ vector fields
$f, g_1, \ldots, g_m$ in our nonlinear system, we assume that the Lie algebra
generated by $f, g_1, \ldots, g_m$ and by taking successive Lie brackets of these
vector fields is a vector bundle with constant fiber dimension p on M. By Chow's
Theorem there exists a maximal $C^\infty$ real p-dimensional submanifold S of M
containing $x_0$ with the generated bundle as its tangent bundle.(1),(2) It is
known that the reachable set from $x_0$ must contain an open set in S.(6),(7) The
largest open subset U of S which is reachable from $x_0$ is called the region of
reachability from $x_0$. If O is an open subset of S which is reachable from
$x_0$, we find necessary conditions and sufficient conditions on the boundary of
O in S so that O = U. Best results are obtained when it is assumed that the Lie
algebra $L'_A$ generated by $g_1, \ldots, g_m$ and their Lie brackets is a vector
bundle on M.

Theorem 2.(4) Let U be the region of reachability from $x_0$ of the system

$$\dot{x}(t) = f(x(t)) + \sum_{i=1}^{m} u_i(t)\, g_i(x(t)), \qquad x(0) = x_0 \in M.$$

If the fiber dimension of $L'_A$ is the constant $p' < p$ on M, then $\partial U$ contains
the $C^\infty$ $p'$-dimensional integral manifolds of the bundle $L'_A$ that intersect
$\partial U$. Also the vector field f always points in the direction of U on $\partial U$.

Theorem 3.$^{4}$ Let $O$ be an open subset of $S \subset M$ containing
$x_0$ in its closure and which is reachable
from $x_0$. Suppose that $L'_A$ is a fiber bundle with
fiber dimension $p'$ on $M$ and $\partial O$ contains the
$p'$-dimensional integral manifolds of $L'_A$ that intersect
it. If $f$ points in the direction of $O$ on
$\partial O$, then $O$ is the region of reachability from $x_0$
for the system

$$\dot{x}(t) = f(x(t)) + \sum_{i=1}^{m} u_i(t)\, g_i(x(t)), \qquad x(0) = x_0 \in M.$$

Although the above described concepts would
appear to be abstract, in some cases, especially
for a hypersurface system on a low dimensional
manifold, they are readily tested via standard
analytic techniques. Here, one generates the
integral manifolds of the vector fields $g_i$, if any
such manifolds exist, by numerically integrating
the equation

$$\dot{x}(t) = \sum_{i=1}^{n-1} u_i(t)\, g_i(x(t)).$$

Then one determines the direction of flow by
evaluating $f$ along these manifolds. If no separating
integral manifold admits a unidirectional
flow the system is globally controllable, whereas
if such a manifold exists it becomes a candidate
for the boundary of the reachable set for points
on one side of the separating manifold. In the
case of a general nonlinear system with linear
inputs one must first generate the Lie algebra
of the $g_i$ as an intermediary step to the computation
of the integral manifolds. As such, the
above described controllability conditions are not
as readily tested, though there are a number of
special cases wherein a practical controllability
test can be obtained. Indeed, we have investigated
a number of examples which have appeared in
the literature and found that in each case our
controllability criterion has yielded a definitive
characterization of the reachable sets in
state-space.
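The numerical test described above can be sketched for the simplest case, a planar system with a single input field. The following Python fragment is only an illustration under stated assumptions: the function names, the RK4 integrator, the step size, the finite integration window, and both example vector fields are hypothetical and do not come from the report. It traces an integral curve of $g$ and records on which side of the curve $f$ points, using the sign of $\det[g(x), f(x)]$.

```python
def rk4_curve(g, x0, step, n_steps):
    """Trace an integral curve of the planar vector field g from x0 by RK4."""
    x = [float(x0[0]), float(x0[1])]
    points = [tuple(x)]
    for _ in range(n_steps):
        k1 = g(x)
        k2 = g([x[i] + 0.5 * step * k1[i] for i in range(2)])
        k3 = g([x[i] + 0.5 * step * k2[i] for i in range(2)])
        k4 = g([x[i] + step * k3[i] for i in range(2)])
        x = [x[i] + step * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
             for i in range(2)]
        points.append(tuple(x))
    return points


def flow_is_unidirectional(f, g, x0, step=0.01, n_steps=500):
    """True if f stays on one side of the integral curve of g through x0.

    The side is read from the sign of det[g(x), f(x)]; points where f is
    tangent to the curve (zero determinant) are ignored.
    """
    signs = set()
    for direction in (1.0, -1.0):          # follow the curve both ways
        for x in rk4_curve(g, x0, direction * step, n_steps):
            gx, fx = g(x), f(x)
            side = gx[0] * fx[1] - gx[1] * fx[0]
            if side > 0:
                signs.add(1)
            elif side < 0:
                signs.add(-1)
    return len(signs) == 1


# Hypothetical example: the integral curves of g are horizontal lines.
g = lambda x: (1.0, 0.0)
f_drift = lambda x: (0.0, 1.0)     # points upward everywhere
f_linear = lambda x: (0.0, x[0])   # changes side across x1 = 0

blocked = flow_is_unidirectional(f_drift, g, (0.0, 0.0))
crossed = flow_is_unidirectional(f_linear, g, (0.0, 0.0))
```

A curve for which the check returns true is only a candidate boundary of the reachable set, and only over the finite window that was integrated. Whether the curve actually separates the manifold has to be established separately.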

Given a bundle of vector fields on an $n$-dimensional
manifold $M$, current research involving
computational methods of finding integral manifolds
of the bundle is of obvious interest with
regard to our controllability results.
11. Abstract of "Global Controllability of Nonlinear Systems in Two Dimensions",
by L.R. Hunt.

Let $M$ be a connected real-analytic 2-dimensional manifold. Consider
the system

$$\dot{x}(t) = f(x(t)) + u(t)\, g(x(t)), \qquad x(0) = x_0 \in M,$$

where $f$ and $g$ are real-analytic vector fields on $M$ which are linearly independent
at some point of $\mathbb{R}^2$, and $u$ is a real-valued control. Sufficient conditions on
the vector fields $f$ and $g$ are given so that the system is controllable from $x_0$.
Suppose that every integral curve of $g$ which disconnects $M$ has a point $p$ where $f$
and $g$ are linearly dependent, $g(p)$ is nonzero, and that the Lie bracket $[f,g]$
and $g$ are linearly independent at $p$. Then the system is controllable (with the
possible exception of a closed, nowhere dense set which is not reachable) from
any point $x_0$ such that the vector space dimension of the Lie algebra $L_A$ generated
by $f$, $g$ and successive Lie brackets is 2 at $x_0$. This is a generalization of the
linear theory for the system

$$\dot{x}(t) = Ax(t) + u(t)B, \qquad x(0) = x_0 \in \mathbb{R}^2,$$

in that the Lie bracket of $Ax$ and $B$ is the constant vector field $AB$. Hence if
$AB$ and $B$ are linearly independent (i.e. the controllability matrix $\{B, AB\}$ has
rank 2), then the linear system is controllable.
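The linear special case quoted at the end of the abstract is easy to check directly. The sketch below is an illustration only; the function name and the two example systems are assumptions introduced here, not taken from the paper. It computes the rank of the 2 by 2 controllability matrix $\{B, AB\}$ from its determinant.

```python
def controllability_rank_2d(A, B):
    """Rank of the 2x2 controllability matrix [B, AB] of x' = Ax + uB."""
    AB = [A[0][0] * B[0] + A[0][1] * B[1],
          A[1][0] * B[0] + A[1][1] * B[1]]
    det = B[0] * AB[1] - B[1] * AB[0]      # det of the matrix with columns B, AB
    if det != 0:
        return 2
    return 1 if any(v != 0 for v in B + AB) else 0


# Hypothetical examples: a double integrator and a system with A = identity.
rank_double_integrator = controllability_rank_2d([[0, 1], [0, 0]], [0, 1])
rank_identity = controllability_rank_2d([[1, 0], [0, 1]], [1, 0])
```

With exact integer or rational entries the determinant test is exact; with floating-point data a tolerance would be needed in place of the comparison with zero.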
12. Abstract of "Controllability of Nonlinear Hypersurface Systems" by
L.R. Hunt.


Consider the nonlinear system

$$\dot{x}(t) = f(x(t)) + \sum_{i=1}^{n-1} u_i(t)\, g_i(x(t)), \qquad x(0) = x_0 \in M,$$

where $M$ is a connected real-analytic $n$-dimensional manifold, $f, g_1, \ldots, g_{n-1}$
are real-analytic vector fields on $M$, and $u_1, \ldots, u_{n-1}$ are real-valued controls.
We are interested in characterizing the largest open subset $U$ of $M$, if any,
which is reachable from $x_0$ and which we call the region of reachability of
our system from $x_0$. If the Lie algebra $L_A$ generated by $f, g_1, \ldots, g_{n-1}$ and
successive Lie brackets has vector space dimension $n$ at $x_0$, and if $f, g_1, \ldots,
g_{n-1}$ are linearly independent at some point in $M$, we find the region of
reachability from $x_0$. Suppose $U$ is the smallest open subset of $M$ with $x_0 \in \bar{U}$
so that $\partial U$ contains the integral manifolds of the Lie algebra $L'_A$ generated
by $g_1, \ldots, g_{n-1}$ that intersect it and $f$ assigns vectors on $\partial U$ which point in
the direction of $U$. Then $U$ is the region of reachability from $x_0$ for our
system. Much of the work is involved in proving a similar result in the more
general $C^\infty$ case under the stronger assumption that $f, g_1, \ldots, g_{n-1}$ are linearly
independent on the connected $C^\infty$ $n$-dimensional manifold $M$.
13. Abstract of "Controllability and Stability" by L.R. Hunt.


Consider the system

$$\dot{x}(t) = f(x(t)) + u(t)\, g(x(t)), \qquad x(0) = x_0 \in \mathbb{R}^2,$$

where $f$ and $g$ are real-analytic vector fields on $\mathbb{R}^2$. If this is a controllable
linear system, then it is well known the system is stabilizable by linear feedback.
We want to consider a similar problem for nonlinear systems, with emphasis
on bilinear systems. Sufficient conditions for the above system to be
controllable have been found, and implementation for bilinear systems has been
discussed. If a bilinear system is controllable under these conditions, we
show that we can move from any point $x_0 \in \mathbb{R}^2 - \{(0,0)\}$ to the origin.
6. Summary:

The goal of the proposed work unit is to develop computationally efficient
fault analysis algorithms which are compatible with the dual mode ATPG/FDS
software structure typically employed in a fault diagnosis system. We have
previously developed several such algorithms applicable to linear systems, one
of which is presently being implemented at the Naval Ocean System Center. This
work is represented in this report by several reprints and also serves as the
foundation for the on-going research on nonlinear fault analysis.

We have investigated three alternative approaches to the nonlinear fault
analysis problem. These include a nonlinear state space approach, an approach
which employs nonlinear integral performance measures in lieu of frequency
domain information, and an approach in which an affine approximation of a linear
system is employed. During the past year our major activity has been in the
state space area in which we have two on-going activities. Both of these assume
an augmented state model

$$\dot{x} = f(x, r), \qquad \dot{r} = 0 \qquad (1)$$
where x is the state vector for the system and r is the vector of component
parameters which we must identify to diagnose a failure. One then measures
a subvector of x or an output process y = g(x,r) and applies some type of
system identification process to estimate r. To this end we have investigated
the possibility of employing nonlinear observers and several possible
quasilinearization algorithms. The former approach was reported in a conference

In addition to the nonlinear state space approach to the fault analysis
problem, we have recently reopened an earlier investigation into the feasi-

software structure employed in linear analog and digital fault analysis. At
the present time a master's thesis in which the viability of the approach
will be investigated is in preparation, though we do not yet have definitive
results.
Fault Diagnosis for Linear Systems Via
Multifrequency Measurements

NEERAJ SEN, Member, IEEE, and RICHARD SAEKS, Fellow, IEEE


Abstract: The fault diagnosis problem for a linear system whose transfer
function matrix is measured at a discrete set of frequencies is formulated.
A measure of solvability for the resultant equations and a
measure of testability for the unit under test are developed. These, in turn,
are used as the basis of algorithms for choosing test points and test
frequencies.

I.  Introduction 

CONCEPTUALLY, the fault analysis problem for an
analog circuit or system amounts to the measurement
of a set of externally accessible parameters of the
system from which one desires to determine the internal
system parameters or equivalently locate the failed
components as illustrated in Fig. 1. Here, the measurements $m_i$
may represent data taken at distinct test points or alternatively,
data taken at a fixed test point under different
stimuli. Similarly, the $r_i$ represent parameters characterizing
the various internal system components. Here, a single
parameter may characterize an entire component, say a
resistance, capacitance or inductance. Alternatively, a
component may be represented by several parameters: the
h parameters of a transistor, the poles and gain of an
op-amp, etc. In general, one models a system component
by the minimum number of parameters which will allow
the failure to be isolated up to a shop replaceable
assembly (SRA) with all “allowed” system failures manifesting
themselves in the form of some parameter change.

Fig. 1. Conceptual model of fault diagnosis problem.

To solve the fault diagnosis problem, one then measures
$m = \operatorname{col}(m_i)$ and solves a nonlinear algebraic equation

$$m = F(r) \qquad (1)$$

for $r = \operatorname{col}(r_i)$ to diagnose the fault. The parameters in the
resultant $r$ vector which are out of tolerance then indicate
the faulty component [6].

The purpose of the present paper is to give an explicit
formulation of the fault diagnosis equations which arise in
the maintenance of linear systems. Here, one measures the
system frequency response as observed from a specified
set of externally accessible test points at a discrete set of
frequencies and it is desired to solve for a vector of
internal system parameters $r$ which completely characterize
the frequency response matrices of the individual
system components, $Z_i(s,r)$, $i = 1, 2, \ldots, q$.

In the following section the explicit form for the fault
diagnosis equations is derived for a given set of test
frequencies. A measure of solvability [15] of these equations
is then developed in Section III and employed in
Section IV in an algorithm for optimally selecting test
frequencies. The measure of solvability for the fault analysis
equations, given an optimal choice of test frequencies,
is then taken as a measure of testability [1], [2], [5] for the
unit under test (UUT) and is used as the basis of an
algorithm for the optimal choice of test points [3]-[5].
Finally, a number of examples are presented in Section V.

II. Explicit Form of the Fault Diagnosis
Equations

In the case of a linear time-invariant circuit or system,
the fault diagnosis equations may be expressed in analytical
form [6]. Since the fault diagnosis equations deal with
the relationship between the externally measurable system
parameters $m$ and the internal component parameters $r$,
we adopt a component connection model as the starting
point for the derivation of the fault diagnosis equations
[7], [8]. This is one of several commonly employed large
scale system models in which the components and connections
in a circuit or system are modeled by distinct
equations, thereby permitting one to deal explicitly with
the relationship between the individual component
parameters and the composite system parameters.

Since the present study is restricted to linear time-invariant
systems, we assume that each component is characterized
by a transfer function matrix which is dependent
on the potentially variable component parameters, $Z_i(s,r)$.
For the classical RLC components $Z_i(s,r)$ may take the
form $R$, $Ls$, or $1/(sC)$ for the case of a resistor, inductor, or
capacitor, respectively. More generally, one may model an
op-amp by the transfer function $k/((s-p_1)(s-p_2))$ where
the parameter vector $r$ now represents the three potentially
variable component parameters $k$, $p_1$, $p_2$; or a delay
by $ke^{-sT}$, etc. Although the symbol $Z$ is used, the components
are not assumed to be represented by impedance
matrices. Indeed, hybrid models are used in most of our
examples. For the purpose of analysis, it is assumed that
all faults manifest themselves in the form of changes,
possibly catastrophic, in the parameter vector $r$ with the
frequency characteristics of the components unchanged.
Although not universal, this fault hypothesis covers the
most commonly encountered situations and subsumes the
common industrial practice of assuming that all failures in
analog circuits and systems take the form of open and
short circuited components [9].

Our system components are thus characterized by a set
of simultaneous equations

$$b_i = Z_i(s,r)\, a_i, \qquad i = 1, 2, \ldots, q \qquad (2)$$

where $a_i$ and $b_i$ denote the component input and output
vectors, respectively. For notational brevity, these component
equations may be combined into a single block
diagonal matrix equation

$$b = Z(s,r)\, a \qquad (3)$$

where $b = \operatorname{col}(b_i)$, $a = \operatorname{col}(a_i)$, and $Z(s,r) = \operatorname{diag}(Z_i(s,r))$.

Although there are many ways to represent the connections
in a circuit or system, say, a block diagram, linear
graph or signal flow graph, any such representation is
simply a graphical means for displaying a set of connection
equations: Kirchhoff laws, adder equations, etc. As
such, for our component connection model we adopt a
purely algebraic connection model in which the connection
equations are displayed explicitly without the intermediary
of some kind of graphical connection diagram.
This takes the form

$$a = L_{11} b + L_{12} u$$
$$y = L_{21} b + L_{22} u \qquad (4)$$

where $u$ and $y$ represent the vectors of accessible inputs
and outputs which are available to the test system. In
simple systems, the connection matrices $L_{ij}$ are usually
obtainable by inspection, whereas, in more complex systems,
computer codes have been developed for their derivation
[7]. Moreover, they are assured to exist in all but
the most pathological systems [8].

It is the pair of simultaneous matrix equations (3) and
(4) which are termed the component connection model.
By combining (3) and (4) to eliminate the component
input and output variables $a$ and $b$ one may derive [6], [7]
an expression for the transfer function matrix observable
by the test system between the test input and output
vectors $u$ and $y$, obtaining

$$S(s,r) = L_{22} + L_{21}\,(I - Z(s,r)L_{11})^{-1} Z(s,r)\, L_{12} \qquad (5)$$

where

$$y = S(s,r)\, u. \qquad (6)$$
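Equation (5) can be evaluated numerically at a single frequency with a few lines of code. The sketch below is an illustration only, written in plain Python with a small Gauss-Jordan solver. The helper names and the RC low-pass example, including its hybrid component model and connection matrices, are assumptions made here rather than material from the paper.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(complex, A[i])) + list(map(complex, B[i])) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        if abs(M[p][c]) < 1e-14:
            raise ValueError("I - Z L11 is singular at this s and r")
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for i in range(n):
            if i != c:
                fac = M[i][c]
                M[i] = [vi - fac * vc for vi, vc in zip(M[i], M[c])]
    return [row[n:] for row in M]


def transfer_matrix(Z, L11, L12, L21, L22):
    """S = L22 + L21 (I - Z L11)^(-1) Z L12, as in (5)."""
    n = len(Z)
    ZL11 = matmul(Z, L11)
    I_minus = [[(1 if i == j else 0) - ZL11[i][j] for j in range(n)]
               for i in range(n)]
    X = solve(I_minus, matmul(Z, L12))     # X = (I - Z L11)^(-1) Z L12
    L21X = matmul(L21, X)
    return [[L22[i][j] + L21X[i][j] for j in range(len(L22[0]))]
            for i in range(len(L22))]


# Hypothetical RC low-pass: the resistor maps voltage to current (1/R), the
# capacitor maps current to voltage (1/(sC)); the output is the capacitor voltage.
s, R, C = 1j, 1.0, 1.0
Z = [[1 / R, 0], [0, 1 / (s * C)]]
L11 = [[0, -1], [1, 0]]    # a1 = u - b2, a2 = b1
L12 = [[1], [0]]
L21 = [[0, 1]]             # y = b2
L22 = [[0]]
S = transfer_matrix(Z, L11, L12, L21, L22)
```

For this circuit the result should agree with the familiar closed form $1/(1 + sRC)$, which gives a quick sanity check on the connection matrices.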
For a linear time-invariant system the transfer function
$S(s,r)$ is a complete description of the measurable data
about the unit under test available to the test system.
Moreover, being rational it is completely determined by
its value at a finite number of frequencies. As such,
without loss of generality, we may take our measured data
to be of the form

$$\operatorname{col}[S(s_1,r), S(s_2,r), \ldots, S(s_k,r)]. \qquad (7)$$

The fault diagnosis equations then take the form

$$\begin{bmatrix} S(s_1,r) \\ S(s_2,r) \\ \vdots \\ S(s_k,r) \end{bmatrix} =
\begin{bmatrix}
L_{22} + L_{21}(I - Z(s_1,r)L_{11})^{-1} Z(s_1,r) L_{12} \\
L_{22} + L_{21}(I - Z(s_2,r)L_{11})^{-1} Z(s_2,r) L_{12} \\
\vdots \\
L_{22} + L_{21}(I - Z(s_k,r)L_{11})^{-1} Z(s_k,r) L_{12}
\end{bmatrix}. \qquad (8)$$

Since $S(s,r)$ is, in general, a matrix, the fault diagnosis
equations as derived above take the form of a matrix
($\operatorname{col}[S(s_i,r)]$) valued function of a vector valued variable $r$.
Computationally, however, we prefer to work with a vector
valued function of a vector valued variable and hence,
we transform $S(s,r)$ into a column vector via

$$\operatorname{vec}[S(s,r)] = \operatorname{col}[S^i(s,r)] \qquad (9)$$

where $S^i(s,r)$ denotes the $i$th column of the matrix $S(s,r)$.
With the aid of the identity $\operatorname{vec}[XYZ] = [Z^t \otimes X]\operatorname{vec}[Y]$,
(8) then transforms into [7], [12]

$$\begin{bmatrix} \operatorname{vec}[S(s_1,r)] \\ \operatorname{vec}[S(s_2,r)] \\ \vdots \\ \operatorname{vec}[S(s_k,r)] \end{bmatrix} =
\begin{bmatrix}
\operatorname{vec}[L_{22}] + [L_{12}^t \otimes L_{21}(I - Z(s_1,r)L_{11})^{-1}]\operatorname{vec}[Z(s_1,r)] \\
\operatorname{vec}[L_{22}] + [L_{12}^t \otimes L_{21}(I - Z(s_2,r)L_{11})^{-1}]\operatorname{vec}[Z(s_2,r)] \\
\vdots \\
\operatorname{vec}[L_{22}] + [L_{12}^t \otimes L_{21}(I - Z(s_k,r)L_{11})^{-1}]\operatorname{vec}[Z(s_k,r)]
\end{bmatrix} = F(r) \qquad (10)$$

which is the form of the fault diagnosis equations with
which we desire to work.

III. Solvability of the Fault Diagnosis
Equations

For the fault diagnosis equations derived above to be a
viable tool of circuit and system diagnosis two fundamental
questions remain to be answered: “What test frequencies
should be employed to optimize the solvability of the
equations?” and “How solvable are the equations given an
optimal choice of test frequencies?” Both of these questions,
in turn, hinge on the development of some type of
measure of solvability [15] for the fault diagnosis equations.

For a set of linear equations

$$m = Fr \qquad (11)$$

where $r$ is an $n$ vector, $m$ is a $p$ vector, and $F$ is a $p$ by $n$
matrix one may characterize the solvability of the equations
in terms of the number of arbitrary parameters in its
solution (if a solution exists). As such, $\delta = n - \operatorname{rank}(F)$ is a
natural measure of the solvability for (11). Here, $\delta = 0$
implies that the equation has a unique solution, $\delta = 1$
implies that the solution is determined up to one arbitrary
parameter and so on, with increasing values of $\delta$ representing
decreasing degrees of solvability.

The fault diagnosis equations are, however, nonlinear
even for linear systems; hence we must resort to the
implicit function theorem to obtain a measure of solvability
[15], [16] analogous to the above [13]. Indeed, if $r_f$ is a
solution to the fault diagnosis equations, then $r_f$ is determined
up to a

$$\delta(r_f) = n - \operatorname{rank}\frac{dF}{dr}(r_f) \qquad (12)$$

dimensional manifold (of arbitrary parameters) in a
neighborhood of $r_f$. Here $dF/dr$ is the Jacobian matrix of
partial derivatives of $F$ with respect to $r$. With the aid of
the matrix identity $d(M^{-1})/dr = -M^{-1}[dM/dr]M^{-1}$,
$dF/dr$ can be computed explicitly from (8) and (10)
yielding

$$\frac{dF}{dr} =
\begin{bmatrix}
\{([I + L_{11}(I - Z(s_1,r)L_{11})^{-1}Z(s_1,r)]L_{12})^t \otimes L_{21}(I - Z(s_1,r)L_{11})^{-1}\}\; d\operatorname{vec}[Z(s_1,r)]/dr \\
\vdots \\
\{([I + L_{11}(I - Z(s_k,r)L_{11})^{-1}Z(s_k,r)]L_{12})^t \otimes L_{21}(I - Z(s_k,r)L_{11})^{-1}\}\; d\operatorname{vec}[Z(s_k,r)]/dr
\end{bmatrix} \qquad (13)$$

where $^t$ denotes matrix transposition and $\otimes$ denotes the
matrix Kronecker (or tensor) product.
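The vec and Kronecker-product identity $\operatorname{vec}[XYZ] = [Z^t \otimes X]\operatorname{vec}[Y]$ used in this derivation can be confirmed on small matrices. The following Python sketch is illustrative only; the helper names and the integer test matrices are assumptions chosen here, and `vec` stacks columns as in (9).

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def vec(A):
    """Stack the columns of A into one long list."""
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]


def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Hypothetical small matrices with compatible shapes (2x2, 2x3, 3x2).
X = [[1, 2], [3, 4]]
Y = [[1, 0, 2], [0, 1, 3]]
Z = [[1, 1], [0, 2], [1, 0]]

lhs = vec(matmul(matmul(X, Y), Z))            # vec[XYZ]
rhs = matvec(kron(transpose(Z), X), vec(Y))   # [Z^t (x) X] vec[Y]
```

Integer entries keep the comparison exact; the two lists `lhs` and `rhs` should coincide entry by entry.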

The difficulty with the implicit function theorem is that
it only yields local information valid in a neighborhood of
a solution. Fortunately, however, given the special nature
of the Jacobian matrix of (13) coupled with an assumption
that the component transfer function matrices $Z_i(s,r)$ are
rational in $r$, it is possible to show that the rank of the
Jacobian matrix is “almost constant.” This, in turn, allows
us to transform the local measure of solvability of (12)
into a global measure of solvability. For this purpose we
adopt the algebraic geometric definition for the term “almost
constant;” i.e., we say that a function of $r_f$ is almost
constant if it is constant except possibly for those values of
$r_f$ lying in an algebraic variety (the solution space of a
finite set of nonzero simultaneous polynomial equations in
$n$ variables). More generally, we say that a property holds
“almost everywhere” or for almost all $r_f$ in $n$ space if it is
true for all values of $r_f$ except possibly those lying in an
algebraic variety. Since the Lebesgue measure of an algebraic
variety is zero, this definition for the concept “almost
everywhere” is consistent with the more common
measure theoretic definition and is more natural in the
context of our application [14].

Theorem 1

Let $Z_i(s,r)$; $i = 1, 2, \ldots, q$; be rational in $r$. Then $\delta(r_f)$ is
almost constant.

Note, the assumption that $Z_i(s,r)$ is rational in $r$ is quite
minor, being satisfied by all of the examples given in
Section II except for the delay (which can be approximated
by a function which is rational in $r$). In practice,
the component transfer function matrices will also be
rational in $s$ though this is not required for the present
theorem since $F$ and $dF/dr$ are formulated in terms of
specific test frequencies, $s_1, s_2, \ldots, s_k$. Given our assumption
on the $Z_i(s,r)$, together with (13), it then follows that
$(dF/dr)(r_f)$ is also rational in $r_f$.
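Theorem 1 has a practical reading: since $\delta(r_f)$ takes the same value at almost every point, it can be estimated by evaluating the Jacobian rank at one arbitrarily chosen parameter vector. The Python sketch below illustrates this under several assumptions that are not from the paper: the Jacobian is approximated by central differences rather than computed from (13), the rank is decided with a hypothetical numerical tolerance, and the RC low-pass response $1/(1 + sRC)$ with three real test frequencies is an invented example.

```python
import random


def numeric_rank(M, tol=1e-6):
    """Rank by Gaussian elimination, treating small pivots as zero."""
    M = [list(map(complex, row)) for row in M]
    rows, cols = len(M), len(M[0])
    scale = max(abs(v) for row in M for v in row) or 1.0
    r = 0
    for c in range(cols):
        if r == rows:
            break
        p = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[p][c]) <= tol * scale:
            continue                         # no usable pivot in this column
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, rows):
            fac = M[i][c] / M[r][c]
            M[i] = [a - fac * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def jacobian(F, r, h=1e-6):
    """Central-difference approximation of dF/dr at the point r."""
    cols = []
    for j in range(len(r)):
        rp, rm = list(r), list(r)
        rp[j] += h
        rm[j] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(F(rp), F(rm))])
    return [list(row) for row in zip(*cols)]


def solvability_at(F, r0):
    """delta(r0) = n - rank(dF/dr)(r0), as in (12)."""
    return len(r0) - numeric_rank(jacobian(F, r0))


def F_rc(r, freqs=(1.0, 2.0, 5.0)):
    """Hypothetical scalar response 1/(1 + sRC) sampled at real frequencies."""
    R, C = r
    return [1 / (1 + s * R * C) for s in freqs]


random.seed(0)
r0 = [random.uniform(0.5, 2.0), random.uniform(0.5, 2.0)]   # generic point
delta = solvability_at(F_rc, r0)
```

In this example only the product $RC$ influences the response, so the two Jacobian columns are proportional and one parameter remains arbitrary no matter how many frequencies are used. A finite-difference Jacobian is adequate for a sketch, but the tolerance would need care for badly scaled circuits.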

Proof of Theorem 1: We begin by showing that an
arbitrary polynomial matrix in $r$, $P(r)$, has almost constant
rank. Since rank $P(r)$ is restricted to the finite set of
integers $\{0, 1, 2, \ldots, j\}$, where $j$ is the minimum of the
number of rows and columns in $P(r)$, there exists an $r_m$
which maximizes the rank of $P(r)$

$$\operatorname{rank}[P(r_m)] \ge \operatorname{rank}[P(r)]. \qquad (14)$$

Now, the rank of a matrix is the dimension of its largest
nonsingular square submatrix. As such, $P(r)$ admits a
square submatrix $M(r)$, whose dimension is equal to the
rank of $P(r_m)$ and for which

$$\det M(r_m) \ne 0. \qquad (15)$$

Now, $\det[M(r)]$ is a polynomial in $r$ which is not identically
zero (from (15)) and hence, it is nonzero almost
everywhere. As such,

$$\operatorname{rank}[P(r_m)] \ge \operatorname{rank}[P(r)] \ge \operatorname{rank}[M(r)] = \operatorname{rank}[P(r_m)] \quad \text{a.e.} \qquad (16)$$

showing that $\operatorname{rank}[P(r)] = \operatorname{rank}[P(r_m)]$ almost everywhere.
As such, $\operatorname{rank}[P(r)]$ is almost constant.

Now, to verify that $\operatorname{rank}[(dF/dr)(r_f)]$ is almost constant we
decompose this matrix as

$$\frac{dF}{dr}(r_f) = \frac{P(r_f)}{d(r_f)} \qquad (17)$$

where $P(r_f)$ is a polynomial matrix and $d(r_f)$ is a nonzero
common denominator. $P(r_f)$ has almost constant rank
while $d(r_f)$ is nonzero almost everywhere and hence can
affect the rank of $P(r_f)$ only on an algebraic variety (since
the division of a matrix by a nonzero scalar does not
affect its rank). As such, our Jacobian matrix has almost
constant rank implying that

$$\delta(r_f) = n - \operatorname{rank}\frac{dF}{dr}(r_f) \qquad (18)$$

is also almost constant. The proof of the theorem is
therefore complete.

Given the theorem, we may now define a global measure
of solvability for the fault diagnosis equations, $\delta$, as the
generic value of $\delta(r_f)$; i.e., the value $\delta(r_f)$ takes on for
almost all $r_f$. This proves to be a natural measure of
solvability since it indicates the ambiguity which will
result from an attempt to solve the fault diagnosis equations
in a neighborhood of almost any failure. Of course,
one requires some sort of equation solving algorithm [10],
[11] to locate a neighborhood of an actual failure. The $\delta$
parameter, however, represents a bound on the performance
of any such algorithm. Finally, we note that since $\delta$
is independent of $r_f$, the solution of the fault diagnosis
equations, it can be computed at the time the system and
its test algorithm are developed by evaluating $\delta(r)$ at a
randomly chosen generic point, say $r_0$. In turn, this
parameter may then be employed as an aid in the choice
of test frequencies and test points.

IV. Test Frequency Selection

Adopting the measure of solvability $\delta$ formulated in the
preceding section, it remains to develop an algorithm for
choosing a set of test frequencies; $s_1, s_2, \ldots, s_k$; which
maximize the solvability of the fault diagnosis equations
(i.e., minimize $\delta$). To this end, let $\delta_{\min}$ denote the minimum
value achieved by $\delta$ for any set of test frequencies;
$s_1, s_2, \ldots, s_k$; $k = 1, 2, \ldots$. Since the possible values for $\delta$
are restricted to the finite set; $\delta = 0, 1, \ldots, n$; such a
minimum is assured to exist.

The following theorem gives an explicit formula for
computing $\delta_{\min}$, while its proof yields an algorithm for
choosing a set of test points which achieve $\delta_{\min}$. Since the
purpose of this theorem is to formulate an algorithm for
choosing test frequencies, the theorem is expressed in
terms of

$$\operatorname{vec}[S(s,r)] = \operatorname{vec}[L_{22}] + [L_{12}^t \otimes L_{21}(I - Z(s,r)L_{11})^{-1}]\operatorname{vec}[Z(s,r)] \qquad (19)$$

and

$$\frac{d\operatorname{vec}[S(s,r)]}{dr} = \{([I + L_{11}(I - Z(s,r)L_{11})^{-1}Z(s,r)]L_{12})^t \otimes (L_{21}(I - Z(s,r)L_{11})^{-1})\} \cdot \{d\operatorname{vec}[Z(s,r)]/dr\} \qquad (20)$$

viewed as rational functions in $s$ rather than in terms of
the function $F(r)$ which is formulated in terms of an a
priori choice of test frequencies.

Theorem 2

Let $Z_i(s,r)$; $i = 1, 2, \ldots, q$; be rational in $s$ and $r$. Then

$$\delta_{\min} = n - \text{col-rank}\left[\frac{d\operatorname{vec}[S(s,r)]}{dr}\right]$$

where $n$ is the dimension of the parameter vector $r$ and
“col-rank” denotes the generic number of linearly independent
columns of the rational matrix $[d\operatorname{vec}[S(s,r)]/dr]$
over the field of complex numbers. Moreover, $\delta_{\min}$ is
achieved by almost any choice of $n - \delta_{\min}$ distinct complex
frequencies.

Proof: For the sake of brevity, we will prove the theorem
only for the special case where $S(s,r)$ is a scalar
transfer function (allowing us to drop the “vec” transformation)
though essentially the same proof goes through in
the general case modulo some notational complexities [5].
Also, since the rank of the Jacobian matrix is almost
constant it suffices to fix the parameter vector $r$ at any
generic point, say $r_0$. This then reduces $[d\operatorname{vec}[S(s,r)]/dr]$
to a row vector of rational functions

$$R(s) = [R_1(s) \;\; R_2(s) \;\; \cdots \;\; R_n(s)] \qquad (21)$$

where

$$R_i(s) = [d\operatorname{vec}[S(s, r_0)]/dr_i] \qquad (22)$$

and our problem reduces to the verification of the fact
that the number of linearly independent columns of $R(s)$
over the field of complex scalars is equal to the maximum
possible rank of the complex matrix

$$\begin{bmatrix}
R_1(s_1) & R_2(s_1) & \cdots & R_n(s_1) \\
R_1(s_2) & R_2(s_2) & \cdots & R_n(s_2) \\
\vdots & \vdots & & \vdots \\
R_1(s_k) & R_2(s_k) & \cdots & R_n(s_k)
\end{bmatrix} = \operatorname{col}(R(s_i)) \qquad (23)$$

over all possible choices of the complex frequencies;
$s_1, s_2, \ldots, s_k$; $k = 1, 2, \ldots$. Now, clearly if some column of
$R(s)$, say the $n$th, is dependent on the remaining columns,
then

$$R_n(s) = \sum_{j=1}^{n-1} a_j R_j(s) \qquad (24)$$

for all $s$. Then by applying (24) individually for each $s_i$

$$\operatorname{col}(R_n(s_i)) = \sum_{j=1}^{n-1} a_j \operatorname{col}(R_j(s_i)) \qquad (25)$$

for any possible number or choice of the $s_i$. The rank of
the matrix of (23), therefore, is less than or equal to the
number of linearly independent columns of $R(s)$ over the
field of complex numbers.

To prove that equality can be achieved with an appropriate
choice of $n - \delta_{\min}$ complex test frequencies $s_i$ we
invoke our assumption that $S(s,r)$ is a scalar transfer
function. Without loss of generality, we may assume that
$R_1(s)$ through $R_q(s)$ are the linearly independent entries in
$R(s)$ over the field of complex numbers in which case we
must show that there exist complex frequencies
$s_1, s_2, \ldots, s_k$ ($k = q$ in this case) which make the first $q$
columns of the matrix of (23) linearly independent.

If $q = 1$, $R_1(s)$ is not identically zero (since otherwise it
would be linearly dependent) and hence for almost all $s_1$,
$R_1(s_1) \ne 0$. As such, the columns in this trivial one by one
matrix are linearly independent. With this as a starting
point, we will use an inductive argument to show that the
theorem holds for all values of $q$. We, therefore, assume
that it has been shown that for $q = p$ there exist complex
frequencies; $s_1, s_2, \ldots, s_p$; such that the matrix

$$R^p = \begin{bmatrix}
R_1(s_1) & R_2(s_1) & \cdots & R_p(s_1) \\
\vdots & \vdots & & \vdots \\
R_1(s_p) & R_2(s_p) & \cdots & R_p(s_p)
\end{bmatrix} \qquad (26)$$

has linearly independent columns and we desire to show
that there exists an $s_{p+1}$ such that the matrix

$$R^{p+1}(s) = \begin{bmatrix}
R_1(s_1) & R_2(s_1) & \cdots & R_p(s_1) & R_{p+1}(s_1) \\
R_1(s_2) & R_2(s_2) & \cdots & R_p(s_2) & R_{p+1}(s_2) \\
\vdots & \vdots & & \vdots & \vdots \\
R_1(s_p) & R_2(s_p) & \cdots & R_p(s_p) & R_{p+1}(s_p) \\
R_1(s) & R_2(s) & \cdots & R_p(s) & R_{p+1}(s)
\end{bmatrix} \qquad (27)$$

has linearly independent columns for $s = s_{p+1}$. By virtue
of our assumption that $S(s,r)$ is a scalar both $R^p$ and
$R^{p+1}(s)$ are square and we may test for linear independence
of the columns of $R^{p+1}(s)$ by computing its determinant.
Expanding (27) in cofactors along its bottom
row, we obtain

$$\det(R^{p+1}(s)) = \sum_{j=1}^{p+1} (-1)^{p+1+j}\, \Delta_j\, R_j(s) \qquad (28)$$

where $\Delta_j$ denotes the determinant of the matrix obtained
from (27) by deleting its bottom row and $j$th column.
Since $R^p$ has linearly independent columns,
$\Delta_{p+1} = \det(R^p) \ne 0$;
hence, the coefficients in the summation of (28) are not all
zero and thus by the linear independence of the $R_j(s)$ the
summation is not identically zero. As such, one can
choose almost any $s_{p+1}$ which will make the determinant
of $R^{p+1}(s_{p+1})$ nonzero thus assuring that $R^{p+1}$ has linearly
independent columns when its rows are evaluated at the
complex frequencies $s_1, s_2, \ldots, s_{p+1}$. The proof of the theorem
is thus complete.

Note that the proof of the theorem yields a natural
sequential algorithm for choosing test frequencies. Moreover,
for the scalar case we have shown that the number
of required test frequencies is exactly $n - \delta_{\min}$ (equal to the
column rank of the Jacobian matrix). In the general case
where $S(s,r)$ is not a scalar, the number of required test
frequencies is less than or equal to $n - \delta_{\min}$ [5].
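The sequential algorithm suggested by the proof can be sketched as a greedy loop for the scalar case: a candidate frequency is kept only if its row $R(s)$ raises the rank of the rows kept so far. The Python fragment below is a hedged illustration. The function names, the rank tolerance, the candidate list, and both example responses (a first-order response $k/(s+p)$ and the RC low-pass $1/(1+sRC)$, each differentiated by hand at an assumed generic point) are hypothetical and not taken from the paper.

```python
def numeric_rank(M, tol=1e-9):
    """Rank by Gaussian elimination, treating small pivots as zero."""
    M = [list(map(complex, row)) for row in M]
    rows, cols = len(M), len(M[0])
    scale = max(abs(v) for row in M for v in row) or 1.0
    r = 0
    for c in range(cols):
        if r == rows:
            break
        p = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[p][c]) <= tol * scale:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, rows):
            fac = M[i][c] / M[r][c]
            M[i] = [a - fac * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def select_frequencies(R, candidates, n):
    """Keep each candidate s whose row R(s) raises the rank of the kept rows.

    Returns the chosen frequencies and n minus the rank reached, which
    estimates delta_min only as far as the candidate list allows.
    """
    chosen, rows = [], []
    for s in candidates:
        trial = rows + [R(s)]
        if numeric_rank(trial) > len(rows):
            rows, chosen = trial, chosen + [s]
        if len(rows) == n:
            break
    return chosen, n - len(rows)


def R_first_order(s, k=2.0, p=1.0):
    """Row [dS/dk, dS/dp] for the hypothetical response S = k / (s + p)."""
    return [1.0 / (s + p), -k / (s + p) ** 2]


def R_rc(s, R=1.0, C=2.0):
    """Row [dS/dR, dS/dC] for the hypothetical response S = 1 / (1 + sRC)."""
    d = (1.0 + s * R * C) ** 2
    return [-s * C / d, -s * R / d]


freqs_a, delta_a = select_frequencies(R_first_order, [1.0, 2.0, 3.0], 2)
freqs_b, delta_b = select_frequencies(R_rc, [1.0, 2.0, 3.0], 2)
```

In the first example the two columns are independent, so two frequencies are accepted and no ambiguity remains. In the second the columns are proportional, so additional frequencies never raise the rank and one parameter stays undetermined.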

Although the theorem implies that one can randomly
choose almost any $n - \delta_{\min}$ test frequencies to maximize
the solvability of the fault diagnosis equations, the result
does not take cognizance of numerical considerations.
Although no theory yet exists for choosing test points with
numerical considerations in mind, it has been our experience
that the “well posedness” of the fault diagnosis
equations is quite sensitive to the choice of test frequencies
[5]. In most of our experiments, we have worked with
real test frequencies to eliminate the necessity of working
in the complex plane. On the other hand, $m$ is most easily
measured when values of $s_i$ on the $j\omega$ axis are employed
whereas it has been suggested that test frequencies symmetrically
spaced around a circle in the complex plane
might yield numerically “well posed” equations.

Although the measure of solvability $\delta$ for the fault diagnosis equations is dependent on the choice of test frequencies, as well as the properties of the unit under test, $\delta_{\min}$ is determined entirely by the UUT (its components, connections, and accessible test points) and is completely independent of the test algorithm employed. As such, $\delta_{\min}$ may be taken as a natural measure of testability [1] for the UUT which characterizes the degree to which the fault analysis equations can be solved given an optimal choice of test frequencies and solution algorithm. Moreover, $\delta_{\min}$ may be used as an aid for the optimal selection of test points [3]-[5]. To this end we may choose a set of test points, from several options, so as to minimize $\delta_{\min}$. Alternatively, we may attribute a cost to each input and output test point and then choose the least cost combination of test points which yields a specified $\delta_{\min}$. This latter process reduces to a rather straightforward integer programming problem and is thus readily automated [4], [5]. The technique is illustrated in the examples of the following section.
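The least cost selection described above can be illustrated with a brute-force enumeration, which stands in here for the integer programming formulation. Everything specific in the sketch is an assumption made for illustration: the output names `y1` to `y4`, the parameter names, the costs, and in particular the toy testability model `toy_delta_min`, which simply counts the parameters that no selected output depends on. It is not the rank computation of the paper and the numbers are not those of any table in it.

```python
from itertools import combinations


def least_cost_test_points(outputs, cost, delta_min, target):
    """Return (total_cost, subset) for the cheapest subset with delta_min <= target."""
    best = None
    for size in range(1, len(outputs) + 1):
        for subset in combinations(outputs, size):
            if delta_min(subset) > target:
                continue  # this combination is not testable enough
            total = sum(cost[o] for o in subset)
            if best is None or total < best[0]:
                best = (total, subset)
    return best


# Toy model (hypothetical): the parameters each output depends on.
SEES = {
    "y1": {"r1", "r4"},
    "y2": {"r2", "r3"},
    "y3": {"r1", "r3"},
    "y4": {"r4"},
}
PARAMETERS = {"r1", "r2", "r3", "r4"}


def toy_delta_min(subset):
    """Number of parameters not seen by any output in the subset."""
    seen = set().union(*(SEES[o] for o in subset))
    return len(PARAMETERS - seen)


cost = {"y1": 1, "y2": 3, "y3": 3, "y4": 1}  # hypothetical measurement costs
best = least_cost_test_points(list(SEES), cost, toy_delta_min, target=0)
```

In practice `toy_delta_min` would be replaced by a routine that evaluates $\delta_{\min}$ for the chosen outputs; the enumeration itself is unchanged.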

V.  Examples 

As an initial illustration of the theory, consider the RC coupled amplifier with inductive load shown in Fig. 2. Here we will take $E_i$ to be the only test input but we will initially allow $E_0$, $i_L$, $i_C$, and $V_1$ to all be taken as test outputs, with the measure of testability $\delta_{\min}$ being used to extract a reduced set of test outputs from these options.

Fig. 2. RC coupled amplifier with inductive load.

A component connection model for this circuit is given by


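$$
\begin{bmatrix} E_0 \\ i_L \\ V_C \\ i_R \end{bmatrix}
=
\begin{bmatrix}
\beta g(s) & 0 & 0 & 0 \\
0 & 1/Ls & 0 & 0 \\
0 & 0 & 1/Cs & 0 \\
0 & 0 & 0 & 1/R
\end{bmatrix}
\begin{bmatrix} V_1 \\ V_L \\ i_C \\ V_R \end{bmatrix}
\qquad (29)
$$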


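$$
\begin{bmatrix} V_1 \\ V_L \\ i_C \\ V_R \\ E_0 \\ i_L \\ i_C \\ V_1 \end{bmatrix}
=
\left[
\begin{array}{cccc|c}
0 & 0 & -1 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & -1 & 0 & 1 \\
\hline
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & -1 & 0 & 1
\end{array}
\right]
\begin{bmatrix} E_0 \\ i_L \\ V_C \\ i_R \\ E_i \end{bmatrix}
\qquad (30)
$$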


Taking our vector of potentially variable component parameters to be $r = \mathrm{col}(\beta, L, C, R)$, each with unity nominal value, we obtain a nominal transfer function matrix

$$
S(s, r) =
\begin{bmatrix}
\dfrac{s\,g(s)}{s+1} \\[2mm]
\dfrac{g(s)}{s+1} \\[2mm]
\dfrac{s}{s+1} \\[2mm]
\dfrac{s}{s+1}
\end{bmatrix}
\qquad (31)
$$


whereas our Jacobian matrix evaluated at the nominal parameter values is given by

Now, an inspection of this matrix will reveal that it has four independent columns over the field of complex numbers and hence if all four possible outputs are used, we will have $\delta_{\min} = 0$, implying that the fault diagnosis equations have locally unique solutions. On the other hand, if only two outputs $E_0$ and $i_C$ are measured, our modified Jacobian matrix will reduce to the first and third rows of the matrix shown in (32), which has column rank 3. As such, if we only use these two test outputs, we obtain $\delta_{\min} = 1$ and hence the solution to the fault diagnosis equations will be characterized by a single arbitrary parameter.

In this latter case, with only $E_0$ and $i_C$ taken as test outputs, Theorem 2 implies that $dF/dr$ will have rank 3 for almost any choice of $3 = n - \delta_{\min}$ test frequencies. Choosing $s_1 = 1$, $s_2 = 2$, and $s_3 = 3$, we obtain

which has three linearly independent columns as long as $g(1) \neq 0$, $g(2) \neq 0$, and $g(3) \neq 0$. Indeed, in this example, any two of the three frequencies would have sufficed to yield three linearly independent columns. Note that for scalar transfer functions Theorem 2 implies that $n - \delta_{\min}$ frequencies are actually required, but for matrix transfer functions fewer frequencies may suffice.

Of course, for the circuit of Fig. 2, we have a choice of some 15 combinations of the four outputs with which we may choose to work for the diagnosis of the circuit. The resultant $\delta_{\min}$'s for the various combinations of outputs are given in Table I [5].

Finally, with the aid of Table I, one may readily develop a test point selection algorithm for our circuit [4], [5]. For instance, if we desire to find the smallest set of outputs which yield a $\delta_{\min} \le 1$, an inspection of the table will reveal that $E_0$ and $i_L$, $i_L$ and $i_C$, or $E_0$ and $i_C$ are the optimal choices. Of course, if one attributes a cost to the various outputs (determined by the convenience of making the required measurements), then we may further distinguish between these three possibilities. For instance, if voltage measurements are deemed to be easier than current measurements, the combination of $i_L$ and $i_C$ may be excluded, with the decision between the remaining two options being dependent on whether it is easier to measure the circuit’s input current ($i_C$) or its load current ($i_L$).

As a second example, consider the one stage transistor amplifier shown in Fig. 3 with the ac equivalent circuit of Fig. 4. Since it is clearly impossible to distinguish between
failures  in  the  two  parallel  bias  resistors,  R m  and  R„,  these 
two  resistors  have  been  combined  into  the  single  resistor, 


TABLE I

Measure of Testability for the Circuit of Fig. 2 for Various Combinations of Test Outputs

Outputs | $\delta_{\min}$


Fig.  3.  One  stage  transistor  amplifier. 


Fig.  4.  Amplifier  equivalent  circuit. 


R'  in  the  component  connection  model  of  (34)  and  (35). 
Taking  all  of  the  component  parameters  as  potentially 
faulty,  r  becomes  a  12  vector  composed  of  ,RL 

and as before, we take all parameters to have the nominal value of unity.

tabulated in Table II [5].

From the table it is apparent that no single test output suffices to yield a $\delta_{\min} = 0$ (perfect testability) though $\delta_{\min} = 0$ can be achieved using two test outputs; $V_0$ and $I_C$
or  V0  and  I,. 

VI.  Conclusions 

Our purpose in the preceding has been to formulate an analytic theory in support of the intuitive art usually associated with the design of a test algorithm. With the aid of the techniques developed above, we believe that it will be possible to develop an automated test program generation (ATPG) algorithm for linear systems [4], [5]. Indeed, such an algorithm could be readily combined with the same computer-aided design (CAD) algorithm used in the system design process [9]. Given the component connection equations, such an algorithm could be employed to automatically (or interactively) choose test points and test frequencies and generate the required set of fault diagnosis equations. These could then be stored on tape and supplied to the automatic test equipment (ATE) in which a faulty system would be tested and the fault diagnosis equations solved.

Although we do not propose to discuss the actual solution of the fault diagnosis equations here, it should be pointed out that by assuming that relatively few components have failed, say $p \ll n$, it is possible to develop specialized algorithms for the solution of the fault diagnosis equations which are far more efficient than standard equation solvers in this application [7], [11], [12]. These are typically derived from the fault simulation algorithms used in the diagnosis of digital systems and may naturally be classified into “simulation before test” and “simulation after test” algorithms. Some of the algorithms are discussed in [7] and [9]-[11].

Finally, we note that as formulated above, the measure of testability $\delta_{\min}$ assumes that any combination of component failures is possible. If, however, we assume that at most $p \ll n$ components fail simultaneously, the ambiguity in the solution of the fault diagnosis equations may actually be less than $\delta_{\min}$. For instance, in the example of Fig. 3, with only $V_0$ taken as an output $\delta_{\min} = 3$, yet the fault diagnosis equations can be solved exactly if we assume that only one parameter is out of tolerance [10]. The point, here, is that even though the solution of the fault diagnosis equations in $n$ space has three arbitrary parameters, when the solution is restricted to the one dimensional manifold of parameter vectors in which all but one coordinate are nominal it is unique.

A Search Algorithm for the Solution of the
Multifrequency Fault Diagnosis Equations

H. S. M. CHEN and RICHARD SAEKS

Abstract: A search algorithm for the solution of the fault diagnosis equations arising in linear time-invariant analog circuits and systems is presented. By exploitation of Householder's formula, an efficient algorithm whose computational complexity is a function of the number of system failures rather than the number of system components is obtained.

I.  Introduction 

Conceptually, the fault analysis problem for an analog circuit or system amounts to the measurement of a set of externally accessible parameters of the system from which one desires to determine the internal system parameters. To solve the fault diagnosis problem, one then measures a vector of external parameters, $m = \mathrm{col}(m_i)$, and solves a nonlinear algebraic equation

$$m = F(r) \qquad (1)$$

for a vector of internal system parameters, $r = \mathrm{col}(r_i)$, to diagnose the fault. For linear time-invariant systems the function $F$ can be expressed analytically [14]. More generally, in the nonlinear case, one can evaluate $F(r)$ for any given parameter vector $r$ with a simulator, and thus solve (1) numerically, even though $F$ has no analytic expression.

Although one does not usually formulate the fault diagnosis problem in terms of the above described equation solving notation, this formulation is equivalent to the classical fault simulation concept [9]. Indeed, fault simulation is simply a search algorithm for solving (1). Here, one precomputes $\hat{m} = F(\hat{r})$ for each allowable¹ faulty parameter vector $\hat{r}$ and then compares the measured $m$ with the simulated $\hat{m}$'s, stored in a fault dictionary, to solve (1).

Although the above described approach to fault simulation has been successful² when applied to digital systems, there is considerable question surrounding its applicability to analog circuits and systems [1]. The problem here is twofold. First, rather than simply failing as a one or zero, an analog parameter has a continuum of possible failures. Second, unlike a digital system wherein a component is either good or bad, in an analog system, a component parameter is either in tolerance or out of tolerance. As such, for each hypothesized failure, it may prove necessary to do an entire family of Monte Carlo simulations in which the values of the good components are randomly chosen within their tolerance limits. Although, at the present time, we have insufficient practical experience to determine the precise number of fault simulations required for analog fault diagnosis, it is estimated that the number of simulations required for an analog system will exceed the number of simulations required for a digital system of similar complexity by a factor ranging between two and six orders of magnitude [1]. As such, the fault simulation concept which has proven to be so successful for a digital system may not be applicable in the analog case.
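The dictionary look-up described above ("simulation before test") can be sketched as follows. The sketch is purely illustrative: the two-resistor divider used as `simulate`, the list of allowable faults, and the nearest-match rule are assumptions chosen to keep the example small, not the method of any particular ATE. It also shows why tolerances matter, since the good component is deliberately placed 5 percent off nominal and the match is therefore only approximate.

```python
def simulate(params):
    """Toy stand-in for F(r): divider ratio and total resistance."""
    r1, r2 = params["R1"], params["R2"]
    return (r1 / (r1 + r2), r1 + r2)


def build_fault_dictionary(allowable_faults):
    """Simulate every allowable fault once, before any unit is tested."""
    return {label: simulate(params) for label, params in allowable_faults.items()}


def look_up(measured, dictionary):
    """Return the label whose stored data vector is closest to the measurement."""
    def squared_distance(label):
        return sum((m - x) ** 2 for m, x in zip(measured, dictionary[label]))
    return min(dictionary, key=squared_distance)


# Hypothetical allowable faults: nominal plus open and short circuits.
ALLOWABLE_FAULTS = {
    "nominal": {"R1": 1.0, "R2": 1.0},
    "R1 open": {"R1": 1e6, "R2": 1.0},
    "R1 short": {"R1": 1e-6, "R2": 1.0},
    "R2 open": {"R1": 1.0, "R2": 1e6},
    "R2 short": {"R1": 1.0, "R2": 1e-6},
}

dictionary = build_fault_dictionary(ALLOWABLE_FAULTS)
# Unit under test: R2 shorted while the good resistor R1 is 5 percent high.
measured = simulate({"R1": 1.05, "R2": 1e-6})
diagnosis = look_up(measured, dictionary)
```

A fault that is not in the dictionary (for example, a 30 percent drift) would still be mapped to its nearest entry, which is one face of the "continuum of possible failures" difficulty raised in the text.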

As an alternative to fault simulation, one may adopt one of the more classical equation solving algorithms for the solution of (1) [2], [3]. Here, one first measures $m$ and, on the basis of this measurement, makes an initial guess $r^0$ (usually taken to be the nominal parameter vector) at the solution of the equations. One then evaluates $m^0 = F(r^0)$ and compares it with $m$. If $m^0 = m$, $r^0$ is the solution to the fault diagnosis equation. If not, one makes a new “educated” guess at the solution $r^1$ (usually based on the deviation between $m$ and $m^0$) and repeats the process by evaluating $m^1 = F(r^1)$ and comparing it with $m$. Hopefully, a sequence of component parameter vectors $r^i$ and simulated data vectors, $m^i = F(r^i)$, is obtained which “quickly” converges to $r$ and $m$, respectively. Since the evaluation of $F(r^i)$ is essentially equivalent to the simulation of the system with the faulty parameter values $r^i$, this technique is really another form of fault simulation. In this case, however, one simulates the system after the data vector has been measured and uses this data to make an educated guess at a (hopefully) small number of parameter vectors at which the system should be simulated. As such, the approach has been termed simulation after test [1] to distinguish it from the classical approach wherein all simulation is done before test [1].

At the time of this writing, both approaches are under study [1], and neither has been shown to be superior. Fault “simulation after test” requires that one include an efficient simulator in the ATE itself, which can be used for on-line computation of $m^i = F(r^i)$ after the UUT has been measured. On the other hand, simulation after test eliminates the requirement of searching a large fault dictionary for the (approximate) data matches required by “simulation before test.” In addition, the complex ATPG requirement for “simulation before test” is eliminated.

To make “simulation after test” feasible, however, an efficient equation solving algorithm is required to obtain convergence of the $r^i$ sequence in a reasonable amount of time. Moreover, since “real world” failures in analog circuits and systems often take the form of open and short circuited components or large parameter deviations from nominal, the classical perturbational algorithms à la Newton-Raphson are inapplicable. Fortunately, in the context of the fault diagnosis problem, one can reasonably assume that relatively few component parameters have failed. As such, even though it is not valid to assume that $r - r^0$ (the deviation of $r$ from nominal) is small in norm, it is reasonable to assume that it is small in “rank.” The purpose of the present paper is to formulate a search algorithm for the solution of the fault diagnosis equations which exploits such an assumption.

²Most industrial users of ATE obtain satisfactory fault detection in digital circuits via fault simulation techniques but require guided probe techniques in addition to the fault dictionary data for fault diagnosis (isolation).

In the remainder of the introduction, the explicit form for the fault diagnosis equations arising in linear time-invariant circuits and systems derived in [3] and [14] is reviewed. Householder’s formula [4] is then used to exploit the special form of these equations in combination with an assumption that $r$ differs from $r^0$ in relatively few coordinates to formulate a search algorithm for the solution of the fault diagnosis equations in which the computational complexity of the simulation process is a function of the number of failures rather than the number of components. This algorithm is based on a similar algorithm suggested by Temes [5] for “simulation before test” and a large-change sensitivity algorithm first given by Leung and Spence [6]. Finally, examples of the application of the algorithm are presented together with a study of the robustness of the algorithm to deviations of the “good” components from their nominal values.

In the case of a linear time-invariant circuit or system, the fault diagnosis equations may be expressed explicitly in analytical form [3], [14]. Indeed, it is the explicit nature of this form which makes our simplified solution algorithm possible. We use a “component connection model” as the starting point for the derivation of the fault diagnosis equations [8]. The system components are characterized by a set of simultaneous equations

$$b_i = Z_i(s, r)\, a_i, \qquad i = 1, 2, \ldots, n \qquad (2)$$

where $a_i$ and $b_i$ denote the component input and output vectors, respectively, $r$ is our vector of internal system parameters which characterizes the “fault state” of the various components, and $s$ is the complex frequency variable. For notational brevity, these component equations are combined into a single block diagonal matrix equation

$$b = Z(s, r)\, a \qquad (3)$$

where $b = \mathrm{col}(b_i)$, $a = \mathrm{col}(a_i)$, and $Z(s,r) = \mathrm{diag}(Z_i(s,r))$.

The connection equations for the model take the form

$$
\begin{aligned}
a &= L_{11} b + L_{12} u \\
y &= L_{21} b + L_{22} u
\end{aligned}
\qquad (4)
$$

where $u$ and $y$ represent the vectors of accessible test inputs and outputs which are available to the test system. By combining (3) and (4) to eliminate the component input and output variables $a$ and $b$, one may derive [3], [8], [14] an expression for the transfer function matrix observable by the test system between the test input and output vectors $u$ and $y$, obtaining

$$S(s, r) = L_{22} + L_{21}\left(I - Z(s, r)L_{11}\right)^{-1} Z(s, r) L_{12} \qquad (5)$$

where

$$y = S(s, r)\, u. \qquad (6)$$
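As a numerical check of (5), the following sketch evaluates the transfer function of a small component connection model at one frequency. The example system (two integrators in a unity feedback loop), the connection matrices, and the helper names are assumptions introduced here for illustration only. For this toy loop the closed form is $S = r_1 r_2 / (s^2 + r_1 r_2)$, so the value at $s = 2$ with unit parameters should be 0.2.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def solve(a, b):
    """Solve a x = b (a square, b with any number of columns) by Gauss-Jordan."""
    n = len(a)
    aug = [[complex(v) for v in a[i]] + [complex(v) for v in b[i]] for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest entry of the column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def transfer_function(z, l11, l12, l21, l22):
    """Evaluate S = L22 + L21 (I - Z L11)^{-1} Z L12 at a single frequency."""
    n = len(z)
    z_l11 = matmul(z, l11)
    lhs = [[(1 if i == j else 0) - z_l11[i][j] for j in range(n)] for i in range(n)]
    b_per_u = solve(lhs, matmul(z, l12))  # component outputs per unit input
    through = matmul(l21, b_per_u)
    return [[l22[i][j] + through[i][j] for j in range(len(l22[0]))]
            for i in range(len(l22))]


# Hypothetical connections: a1 = u - b2, a2 = b1, y = b2.
L11 = [[0, -1], [1, 0]]
L12 = [[1], [0]]
L21 = [[0, 1]]
L22 = [[0]]


def z_matrix(s, r):
    """Block diagonal Z(s, r) = diag(r1/s, r2/s): two integrators."""
    return [[r[0] / s, 0], [0, r[1] / s]]


s_value = transfer_function(z_matrix(2.0, (1.0, 1.0)), L11, L12, L21, L22)[0][0]
```

Stacking such evaluations at several frequencies $s_1, \ldots, s_k$ gives the data vector used in the fault diagnosis equations.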

For a linear time-invariant system the transfer function $S(s,r)$ is a complete description of the measurable data about the UUT available to the test system. Moreover, being rational it is completely determined by its value at a finite number of frequencies. As such, without loss of generality we may take our vector of measured data to be of the form

$$m = \mathrm{col}\left(S(s_1, r), S(s_2, r), \ldots, S(s_k, r)\right). \qquad (7)$$

The fault equations then take the form

$$
m =
\begin{bmatrix}
S(s_1, r) \\
S(s_2, r) \\
\vdots \\
S(s_k, r)
\end{bmatrix}
=
\begin{bmatrix}
L_{22} + L_{21}\left(I - Z(s_1, r)L_{11}\right)^{-1} Z(s_1, r) L_{12} \\
L_{22} + L_{21}\left(I - Z(s_2, r)L_{11}\right)^{-1} Z(s_2, r) L_{12} \\
\vdots \\
L_{22} + L_{21}\left(I - Z(s_k, r)L_{11}\right)^{-1} Z(s_k, r) L_{12}
\end{bmatrix}
= F(r). \qquad (8)
$$

In the present context we will assume that $s_1, s_2, \ldots, s_k$ represent sufficiently many frequencies to permit the fault diagnosis equations to be solved. Indeed, algorithms for determining such a set of frequencies when they exist are given in [10]-[12], and [14]. The problem at hand is the development of an efficient algorithm for the solution of these fault diagnosis equations.

II.  Householder's  Formula  and  the  Search 
Algorithm 

Given the explicit form of the fault diagnosis equations of (8), it is apparent that the vast majority of the computation required for the simulation of $F(r)$, either before or after test, is the inversion of the family of matrices $(I - Z(s_i, r)L_{11})$, $i = 1, 2, \ldots, k$. Fortunately, given the assumption that relatively few components have failed, i.e., that $r$ differs from its nominal value $r^0$ in only a small number of coordinates, Householder’s formula [4] may be invoked to compute $(I - Z(s_i, r)L_{11})^{-1}$ in terms of $(I - Z(s_i, r^0)L_{11})^{-1}$ together with the inversion of a small dimensional matrix. More precisely, if $A$, $B$, $C$, and $D$ are given matrices of dimension $n \times n$, $n \times n$, $n \times p$, and $p \times n$, respectively, where

$$A = B + CD \qquad (9)$$

then

$$A^{-1} = \left[ I - B^{-1}C\left(I + DB^{-1}C\right)^{-1} D \right] B^{-1}. \qquad (10)$$

As such, once $B^{-1}$ is known, one may compute the inverse of the $n \times n$ matrix $A$ in terms of $B^{-1}$ and the inverse of the $p \times p$ matrix $(I + DB^{-1}C)$. This technique has been used effectively for large change sensitivity analysis [6] and has recently been suggested by Temes for application to fault simulation [5]. This is achieved by exploiting the block diagonal character of $Z(s,r)$. Thus if $r$ differs from $r^0$ in $q$ coordinates, $Z(s,r)$ will differ from $Z(s, r^0)$ only in the $p \times p$ block composed of components which are affected by the faulty parameters.³ If the rows and columns of $Z(s,r)$ are reordered so that this block appears in the upper left corner of $Z(s,r)$ then

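$$Z(s_i, r) = Z(s_i, r^0) + \begin{bmatrix} \Delta & 0 \\ 0 & 0 \end{bmatrix} \qquad (11)$$

where $\Delta$ is $p \times p$ and $Z(s,r)$ is $n \times n$. We then have

$$\left(I - Z(s_i, r)L_{11}\right) = \left(I - Z(s_i, r^0)L_{11}\right) + \begin{bmatrix} -\Delta \\ 0 \end{bmatrix} L_{11}^p \qquad (12)$$

and hence, by (10),

$$
\left(I - Z(s_i, r)L_{11}\right)^{-1} =
\left[ I - \left(I - Z(s_i, r^0)L_{11}\right)^{-1} \begin{bmatrix} -\Delta \\ 0 \end{bmatrix}
\left( I + L_{11}^p \left(I - Z(s_i, r^0)L_{11}\right)^{-1} \begin{bmatrix} -\Delta \\ 0 \end{bmatrix} \right)^{-1} L_{11}^p \right]
\left(I - Z(s_i, r^0)L_{11}\right)^{-1}. \qquad (13)
$$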

Although quite complex, the only major matrix computation required for the inversion of $(I - Z(s_i, r)L_{11})$ via (13) is the inversion of the $p \times p$ matrix in parentheses. As such, as long as the number of faulty parameter values remains small, (13) represents an extremely efficient means of carrying out a large number of fault simulations with relatively little computational capacity. Although Temes originally suggested the technique in the context of a “simulation before test” algorithm, the above application of Householder’s formula is ideally suited for “simulation after test,” wherein it reduces the computational requirements for the simulation process to well within the capabilities of the minicomputers usually found in modern ATE.
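The low-rank update of (9) and (10) can be checked exactly on a small example. In the sketch below the matrices `B`, `C`, and `D` are arbitrary illustrative choices (a rank-one change to a 3 by 3 matrix), and the helper names are assumptions; exact rational arithmetic is used so that the inverse obtained from (10) can be compared with a direct inverse without rounding error. Only a $p \times p$ matrix (here 1 by 1) is inverted once $B^{-1}$ is available.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse(a):
    """Exact Gauss-Jordan inverse; raises StopIteration if a is singular."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def householder_inverse(b_inv, c, d):
    """A^{-1} = [I - B^{-1} C (I + D B^{-1} C)^{-1} D] B^{-1} for A = B + C D."""
    n, p = len(c), len(d)
    b_inv_c = matmul(b_inv, c)                      # n x p
    d_b_inv_c = matmul(d, b_inv_c)                  # p x p
    small = [[(1 if i == j else 0) + d_b_inv_c[i][j] for j in range(p)]
             for i in range(p)]
    correction = matmul(matmul(b_inv_c, inverse(small)), d)   # n x n
    bracket = [[(1 if i == j else 0) - correction[i][j] for j in range(n)]
               for i in range(n)]
    return matmul(bracket, b_inv)


B = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # "nominal" matrix, inverted once
C = [[1], [0], [0]]                      # n x p with p = 1
D = [[5, 0, 0]]                          # p x n
CD = matmul(C, D)
A = [[B[i][j] + CD[i][j] for j in range(3)] for i in range(3)]

fast = householder_inverse(inverse(B), C, D)
direct = inverse(A)
```

In the fault diagnosis setting `inverse(B)` plays the role of the nominal inverse, which is computed once per test frequency and reused for every hypothesized fault.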

Although Householder’s formula yields an efficient means for solving the fault diagnosis equations once the faulty parameters have been determined, it remains to locate the set of faulty parameters. Fortunately, the efficiency of the solution algorithm based on Householder's formula is such that one can justify a search through “all” allowable sets of faulty parameters to locate the actual failures. Indeed, if we denote the “reduced fault diagnosis equations” in which all component values are assumed to be nominal except for $q$ specified parameters, $r_{i(1)}, r_{i(2)}, \ldots, r_{i(q)}$, by $F_{i(1), i(2), \ldots, i(q)}$, then the equation

$$m = F_{i(1), i(2), \ldots, i(q)}\left(r_{i(1)}, r_{i(2)}, \ldots, r_{i(q)}\right) \qquad (14)$$

will have a solution if and only if the faulty parameter values are among the $r_{i(1)}, r_{i(2)}, \ldots, r_{i(q)}$. As such, if one attempts to solve (14) for each allowable family of faulty parameters, the actual fault will be indicated by the existence of a solution to the equation.

Although such a search algorithm might at first seem to be highly inefficient, when one observes that with the aid of Householder's formula the evaluation of $F_{i(1), i(2), \ldots, i(q)}$ requires only the inversion of a $p \times p$ ($p \approx q$) matrix, it is seen that this is not the case. Moreover, if one searches for the most likely failures first, relatively few equations need be solved in practice. In actual implementation in a “simulation after test” algorithm, one can readily search through all possible combinations of one, two, or three simultaneous failures, and commonly encountered combinations of larger numbers of failures, thus locating the vast majority of failures in a reasonable amount of ATE time.

An alternative formulation of the search algorithm which alleviates the numerical difficulties associated with the attempt to solve a set of equations which may not have a solution (as is the case whenever one attempts to solve (14) with the wrong choice of faulty parameters) is to employ an optimization algorithm, rather than an equation solver, to minimize

$$J_{i(1), i(2), \ldots, i(q)}\left(r_{i(1)}, r_{i(2)}, \ldots, r_{i(q)}\right) = \left\| m - F_{i(1), i(2), \ldots, i(q)}\left(r_{i(1)}, r_{i(2)}, \ldots, r_{i(q)}\right) \right\|^2. \qquad (15)$$

Since (15) has a zero minimum if and only if (14) has a solution, a search through the minimization of (15) for all allowable sets of faulty parameters will also locate the faulty parameters (indicated by a zero minimum).

In (12) and (13), $L_{11}^p$ denotes the upper (after reordering) $p$ rows of $L_{11}$.

³Here, $p$ is the sum of the dimensions of all the blocks of $Z(s,r)$ which are dependent on the $q$ coordinates in which $r$ differs from $r^0$. Typically, $q \approx p$ with the exact relationship depending on the block size.

Fig. 1. LC filter.


Examples 

As a first example, consider the LC filter shown in Fig. 1 for which the component connection model takes the form

Fig. 3. Amplifier equivalent circuit.


Since we assume that the source and load resistors are external to the filter and do not fail, they have been embedded into the connection equations and thus do not appear explicitly as components. The filter components are assumed to have the nominal values

$$C_1 = 10, \quad L_1 = 20, \quad C_2 = 30, \quad \text{and} \quad L_2 = 40 \qquad (18)$$

and it is assumed that no more than one component fails at a time (though the failure may be catastrophic). Our “simulation after test” fault diagnosis algorithm then requires that we minimize $J_1(C_1)$, $J_2(L_1)$, $J_3(C_2)$, and $J_4(L_2)$. The performance measure with zero minimum then represents the failed component, with the minimizing value for that performance measure representing the value of the failed component. All other component values must then be nominal (since it is assumed that only one component fails). Note that the minimizing value for the nonzero $J$'s does not correspond with the correct component values for those components.

This filter was simulated with each of its four components out of tolerance (by as much as 100 percent) with the search algorithm being applied to the simulated data. Since only one parameter is assumed to fail at a time and $Z(s,r)$ is diagonal, each of the four required minimizations was carried out by purely scalar operations using a Golden Section search. In all four cases the fault was correctly located with the faulty parameter value being determined “exactly.” The resultant data is summarized in Table I. Note that in each case the minimum value of $J_i$ for the faulty component is at least three orders of magnitude lower than the minimum value of $J_i$ for any nonfaulty component. As such, the failure is easily located and one can expect the algorithm to remain viable in the face of numerical and/or approximation error.
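The single-fault search with a Golden Section minimization can be sketched on a toy problem. The model $S(s) = a/(s+b)$, the nominal values, the two real test frequencies, and the search interval are all assumptions made for illustration and are unrelated to the LC filter of Fig. 1. Each parameter in turn is hypothesized to be the only faulty one, the corresponding squared error is minimized, and the parameter with the smallest minimum is declared faulty.

```python
import math


def golden_section_minimize(f, lo, hi, tol=1e-9):
    """Minimize a unimodal scalar function on [lo, hi]; returns (x, f(x))."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    x = (a + b) / 2
    return x, f(x)


def simulate(a, b, frequencies):
    """Toy stand-in for F(r): samples of S(s) = a / (s + b)."""
    return [a / (s + b) for s in frequencies]


NOMINAL = {"a": 1.0, "b": 1.0}   # hypothetical nominal parameter values
FREQUENCIES = [1.0, 2.0]         # real test frequencies


def locate_single_fault(measured, lo=0.1, hi=10.0):
    """Minimize the squared error for each single-fault hypothesis in turn."""
    results = {}
    for name in NOMINAL:
        def cost(value, name=name):
            params = dict(NOMINAL, **{name: value})
            predicted = simulate(params["a"], params["b"], FREQUENCIES)
            return sum((m - p) ** 2 for m, p in zip(measured, predicted))
        results[name] = golden_section_minimize(cost, lo, hi)
    faulty = min(results, key=lambda n: results[n][1])
    return faulty, results


measured = simulate(2.0, 1.0, FREQUENCIES)   # "a" has doubled; "b" is nominal
faulty, results = locate_single_fault(measured)
```

For the wrong hypothesis the minimum stays well away from zero, which mirrors the gap of several orders of magnitude between faulty and nonfaulty components reported in the text.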

As  a  more  sophisticated  example,  consider  the  one  stage 
transistor  amplifier  of  Fig.  2  and  its  wide  band  equivalent  circuit 
shown  in  Fig  3.  Note  that  the  parallel  resistors  A.  and 
appearing  in  this  model  have  been  lumped  together  into  a  single 
resistance  R,  since  it  is  clearly  impossible  to  distinguish  between 
failures  in  these  two  components  from  external  measurements. 


TABLE I

Fault Analysis for LC Filter

(20)


The  component  and  connection  equations  for  this  circuit  are 
given  by  (19)  and  (20)  and  the  nominal  values  for  the  component 
parameters  are  taken  to  be 

C,«20  r,-10  r,  -40  C.-25  Cj-20  /?,- 75 

/?,  ”30  C.-13  C.-IO  ju-10  10  /?,  —20. 

(21) 

As before, it was assumed that no more than one component failed and $J_i(r_i)$ was minimized for each of the 12 component parameters. Once again the failure was clearly located by the smallest minimum with accurate determination of the faulty parameter value. Indeed, in each case, the minimum value of $J_i(r_i)$ for the faulty parameter is at least five orders of magnitude less than the minima for the remaining parameters. As such, there is no ambiguity whatsoever in the determination of the faulty component and its value even though the component parameters have been allowed to deviate from their nominal values by as much as 500 percent.


Robustness

Unlike the case of fault diagnosis in a digital system, wherein a
component is unambiguously good or bad, in an analog circuit
or system a component parameter is either in tolerance or out of
tolerance. As such, any fault diagnosis algorithm which makes
use of the nominal component parameters must be tested for
robustness, i.e., how effective is the algorithm at locating the
faulty component(s) when the good components are not precisely
equal to their nominal values? As such, our algorithm
for fault diagnosis was applied to the transistor amplifier
using simulated measurements in which one component was out
of tolerance (taken to be 10 percent) and the remaining component
parameters were in tolerance but not equal to their nominal
values [7]. Of course, the nominal values are used to define the $F_i$
since the actual value of the good components is unknown. Not
surprisingly, this results in some ambiguity in the diagnosis
process since $J_i(r_i)$ can never be reduced exactly to zero. As
such, our simulation yielded good though not perfect results. In
particular, the algorithm correctly located the fault in 71 percent
of the trials, with an ambiguity group of one in 50 percent of
these cases and ambiguity groups of two, three, and four in the
remaining cases. Since all of the good components in this simulation
were taken to be at the limits of their tolerance interval,
these results actually represent a worst case situation. As such,
we believe that the search algorithm will yield significantly better
results in a "real world" situation, wherein most of the components
will have near nominal values with relatively few of the
good component parameters lying near their tolerance limits.

Hybrid Algorithms

Although the terminology has only recently been formulated
[1], most of the algorithms which have been proposed over the
years for the solution of the fault analysis problem in analog
circuits and systems can naturally be categorized as either
"simulation before test" or "simulation after test" algorithms [9].
Although the preceding development has been presented in the
context of a "simulation after test" algorithm, many of the
techniques, such as the application of Householder's formula [5],
are also applicable to "simulation before test" algorithms. Indeed,
the techniques are ideally suited to a hybrid algorithm.
Here,  one  would  employ  a  two-pass  diagnostic  algorithm 
wherein  the  measured  data  vector  m  is  first  compared  with 
presimulated  data  stored  in  a  fault  dictionary.  If  the  fault  is  so 
located,  the  diagnosis  process  is  terminated.  If  the  fault  is  not 
located  among  those  which  have  been  presimulated  and  stored 
in  the  fault  dictionary,  the  hybrid  algorithm  will  then  revert  to  a 
“simulation  after  test”  mode  until  a  sequence  of  parameter 
vectors  rt  and  simulated  data  vectors  m,  have  been  computed 
which  converge  to  the  solution  of  the  fault  diagnosis  equations. 
At  the  same  time  the  results  of  each  of  these  “after  test" 
simulations  are  stored  in  the  fault  dictionary  for  use  in  future 
applications  of  the  test  algorithm.  As  such,  a  fault  dictionary  is 
slowly  built  up  which  includes  simulations  of  those  failures 
which  are  most  commonly  encountered  in  actual  practice.  Such 
a  hybrid  algorithm  would  seem  to  achieve  the  best  of  both 
worlds.  Common  faults  would  be  found  quickly  on  the  first  pass, 
yet  the  system  would  still  have  the  “simulation  after  test”  algo¬ 
rithm  upon  which  to  fall  back  when  encountering  a  new  failure 
mode.  Moreover,  ATPG  requirements  would  be  greatly  reduced 
with  only  the  most  common  faults  (say  open  and  short  circuits, 
single failures, etc.), being presimulated and the remainder of the
fault  dictionary  being  adaptively  generated  by  the  “simulation 
after  test”  algorithm  as  new  fault  modes  are  encountered.  Such  a 
hybrid  scheme  alleviates  the  necessity  of  determining  the  fault 
modes  of  a  system  in  advance,  as  required  for  “simulation  before 
test”  while  simultaneously  eliminating  the  duplicate  simulations 
of  common  faults  required  for  “simulation  after  test.” 
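
The two-pass scheme described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the class name `HybridDiagnoser`, the matching tolerance `tol`, and the callable `solve_after_test` (standing in for whatever "simulation after test" routine is available) are all hypothetical. As a simplification, only the final diagnosis of each fallback run is stored, whereas the text stores the results of every intermediate simulation.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Vector = Tuple[float, ...]


class HybridDiagnoser:
    """Two-pass diagnosis: fault-dictionary lookup first, fallback solver second."""

    def __init__(self, solve_after_test: Callable[[Vector], str], tol: float = 1e-3):
        self.solve_after_test = solve_after_test  # hypothetical "simulation after test" routine
        self.tol = tol                            # hypothetical matching tolerance
        self.dictionary: Dict[Vector, str] = {}   # stored data vector -> fault label
        self.fallback_calls = 0

    def presimulate(self, m: Sequence[float], fault: str) -> None:
        """Store a presimulated data vector ("simulation before test")."""
        self.dictionary[tuple(m)] = fault

    def _lookup(self, m: Vector) -> Optional[str]:
        # First pass: compare the measured vector with every stored vector.
        for stored, fault in self.dictionary.items():
            if len(stored) == len(m) and all(
                abs(a - b) <= self.tol for a, b in zip(stored, m)
            ):
                return fault
        return None

    def diagnose(self, m: Sequence[float]) -> str:
        m = tuple(m)
        fault = self._lookup(m)
        if fault is not None:
            return fault                      # common fault found on the first pass
        self.fallback_calls += 1
        fault = self.solve_after_test(m)      # second pass: simulate after test
        self.dictionary[m] = fault            # adaptively extend the dictionary
        return fault
```

A fault seen once through the fallback path is answered from the dictionary the next time it occurs, which is the adaptive behavior the text attributes to the hybrid scheme.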

III.  Conclusions 

Our purpose in the preceding has been the formulation of a
class of techniques which we believe can serve as the basis of an
effective algorithm for fault diagnosis in linear analog circuits
and systems. These techniques have proven to be effective in the
situation where all good component parameters are "near"
nominal and give promise of sufficient robustness to cope with
the "real world" situation, in which the good component parameters
are in tolerance though not nominal.

Although the presentation has been formulated in the context
of a "simulation after test" algorithm, the techniques presented
are also applicable to "simulation before test" and hybrid algorithms.

Failure Prediction for an On-Line Maintenance
System in a Poisson Shock Environment

K. S. LU and R. SAEKS, FELLOW, IEEE

Abstract— A  failure  prediction  algorithm  for  application  in  a 
periodic  on-line  maintenance  system  operating  in  a  Poisson  shock 
environment  is  described.  The  system  under  test  is  measured  at 
periodic  maintenance  intervals  with  the  data  derived  therefrom  being 
used  to  estimate  system  lifetime  and  determine  an  optimal  replace¬ 
ment  time.  The  resultant  algorithm  is  simulated  and  compared  with 
various  fixed  replacement  schedules. 

I.  Introduction 

Although  considerable  effort  has  been  expended  during  the  past 
decade  to  develop  techniques  for  fault  detection  and  diagnosis  in 
both  analog  and  digital  electronic  circuits  [10],  little  attention  has 
been  given  to  the  possibility  of  formulating  algorithms  for  fault 
prediction.  To  accurately  predict  a  fault,  a  device  must  be  tested  at 
periodic  maintenance  intervals.  If  the  device  fails  or  does  not 
operate  correctly,  it  is  replaced  immediately.  The  device  may  be 
assumed  good  if  its  characteristics  are  in  tolerance.  However,  if 
the  characteristics  are  slightly  off  nominal  but  the  device  still 
operates  correctly,  one  can  attempt  to  predict  if  the  device  will  fail 
before  the  next  scheduled  maintenance  interval.  If  device  failure  is 
predicted,  it  can  be  replaced  before  failure  occurs  as  part  of 
planned  preventative  maintenance. 


Manuscript received April 3, 1978; revised September 18, 1978 and February 1,
1979. This work was supported in part by the Joint Services Electronics Program at
Texas Tech University under ONR Contract 76-C-1136.
R. Saeks is with the Department of Electrical Engineering, Texas Tech University,
Lubbock, TX 79409.
K. S. Lu is with Texas Instruments Incorporated, Dallas, TX.

0018-9472/79/0600-0356$00.75 © 1979 IEEE

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, VOL. SMC-9, NO. 6, JUNE 1979


With the advent of the low-cost microprocessor, on-line fault
prediction is possible and practical [9]. For this purpose, a curve
fitting algorithm for on-line fault prediction was first introduced
by Saeks, Liberty, and Tung [11]-[13] in 1975. The disadvantage
of this algorithm, however, is that the second-order polynomial
model employed is too simple to describe the aging curve of a
real-world component. Employing the Poisson-shock model for
the wear process introduced by Esary, Marshall, and Proschan
[1], [2], [6], another curve fitting fault prediction algorithm which
overcomes this disadvantage is discussed in the present paper.

In the following section a model for the failure dynamics of a
system component parameter is formulated. Here it is assumed
that the failure is due to the component being subjected to a
sequence of Poisson distributed shocks [3], [7], with the measurable
parameter being controlled by an unknown difference equation
whose underlying discrete "component time" process is
defined by the number of shocks to which the component has
been subjected. Since both the failure dynamics (i.e., the difference
equation) and the relationship between "component time" and
real time are unknown, our failure model is doubly stochastic. The
third section of the paper is devoted to the formulation of an
algorithm for estimating the component failure dynamics and its
"lifetime," defined to be the number of shocks required to cause
component failure. This is followed by the formulation of an
"optimal" replacement theory wherein the optimal real time at
which to replace a component is computed in terms of its
estimated "lifetime." Finally, the results of a simulation of the
algorithm in both an ideal and a noisy environment are presented
and compared with the simulated performance for several fixed
replacement schedules.

II. Failure Dynamics

Let $C(N)$ represent values of a particular component
parameter, where the "component time" $N$ denotes the number of
shocks the component has received. It is assumed that the drifting
parameters can be described by a first-order¹ difference equation
of the form:

$$C(N+1) = C(N) - a_0 - a_1 N - a_2 N^2 - \cdots - a_h N^h, \qquad C(0) = 1. \tag{1}$$

Here the coefficients and order of the "forcing polynomial" are
assumed to be unknown and must be estimated as part of the fault
prediction process. A little algebra together with the standard
recursive formula for solving a difference equation will reveal that

$$C(N) = 1 - \sum_{j=0}^{N-1} \sum_{i=0}^{h} a_i j^i. \tag{2}$$

Now, if the tolerance limit for the component parameter is
normalized to $C = 0$, we may define the lifetime of the component
to be the smallest integer $N$ for which $C(N) \le 0$. This integer,
which we denote by $L$, then represents the number of shocks
necessary to cause the component to fail.

Since the failure model of (1) is dependent on "component
time," i.e., the number of shocks the component has received,
rather than real time, it remains to define the relationship between
"component time" and real time. Following common practice in
reliability theory [1], we assume that this relationship is determined
by a Poisson process. Indeed, this is the unique point
process which has the scaling properties required for such an
application [3]. Here the probability of $N$ shocks occurring in the
time interval $t$ is

$$p_N(t) = \frac{(kt)^N}{N!} e^{-kt}, \qquad N = 0, 1, 2, \ldots \tag{3}$$

where $k$ is a given constant representing the average number of
shocks per unit time. Therefore, $(kt)$ is the average number of
shocks in the time interval $t$.

¹ The concepts developed herein carry over without modification to the case where
the failure model is characterized by higher order difference equations. The first-order
case, however, suffices to illustrate the theory and is hence used throughout
the present paper.


III. Estimation of Failure Dynamics
and Lifetime

In a periodic maintenance system, the performance of a component
is measured at each maintenance interval $nT$. That is to
say, $(C_1, C_2, \ldots, C_g)$ is the performance data taken at maintenance
times $(T, 2T, \ldots, gT)$. The estimation problem can be stated
as, "Given performance data $(C_1, C_2, \ldots, C_g)$, $T$, and $k$, estimate
the unknown constants $(a_0, a_1, \ldots, a_h)$ of the failure dynamics."

Since it is assumed that the system is subjected to Poisson shock
with constant $k$, the expected number of shocks in each maintenance
interval is $kT$.² As such, if we assume that $C_m$ is the value of
the component parameter at $N = mkT$, then upon substituting
$C_m = C(mkT)$ in (2), we obtain

$$\sum_{j=0}^{mkT-1} a_0 j^0 + \sum_{j=0}^{mkT-1} a_1 j^1 + \cdots + \sum_{j=0}^{mkT-1} a_h j^h = 1 - C_m$$

where $m = 1, 2, 3, \ldots, g$, or in the matrix form:

$$J A = Z \tag{4}$$

where $A = \mathrm{col}(a_0, a_1, \ldots, a_h)$, $Z = \mathrm{col}(1 - C_1, 1 - C_2, \ldots, 1 - C_g)$,
and $J$ is the $g \times (h+1)$ matrix whose $(m, i)$ entry is $\sum_{j=0}^{mkT-1} j^i$.

² Although not theoretically necessary, we assume that $kT$ is an integer.

Since the number of data points $g$ is typically much greater than
the order $h$ of the polynomial assumed in the failure model, it is
not expected that (4) admits an exact solution. Rather, we attempt
to solve for a coefficient vector $A$ which minimizes the error between
$JA$ and $Z$. In particular, if one adopts a least squares error
criterion, the optimal $A$ is given by

$$A^0 = J^{-g} Z \tag{5}$$

where $J^{-g}$ denotes the generalized inverse of $J$ [8]. Indeed, if, as is
typically the case, $J$ has full column rank, then $J^{-g} = (J^t J)^{-1} J^t$,
where $t$ denotes matrix transposition. As such, we take
$A^0 = \mathrm{col}(a_0^0, a_1^0, \ldots, a_h^0)$ as our estimate of the coefficients of the
difference equation characterizing the failure dynamics of our
drifting parameter $C$ as per (1).

To estimate the failure dynamics of a drifting parameter, the
proper choice of the order $h$ is, in general, quite difficult and
depends upon physical considerations and engineering experience.
Once $h$ is preselected, however, coefficients to best approximate
the failure dynamics can be readily computed via (5). The
accuracy of the resultant estimate, however, is highly dependent
on the choice of the order $h$ and on the number of measurements
$g$ which are taken. To find a new set of coefficients for a different
combination of $h$ and $g$, the entire calculation procedure is
typically repeated from the very beginning, a process which is
impractical in the on-line maintenance system. Fortunately, sequential
refinement schemes for obtaining new sets of coefficients
without repeating the entire calculation can be developed [5], [8].
As such, it is possible to sequentially update one's estimates of the
parameters $a_0, a_1, \ldots, a_h$ as additional measurements are taken
and/or to increase the order of the model for the failure dynamics
without repetitious matrix inversion. Our algorithm for estimation
of the failure dynamics underlying the measured data may
thus be readily implemented on-line with the computational
power presently available in today's microprocessors. The matrix
algebraic details of the required sequential refinement schemes are
straightforward [5], [8] and readily available in the literature. As
such, they will not be repeated here.

In practice, given $g$ measurements $C_1, C_2, \ldots, C_g$ taken at maintenance
intervals $T, 2T, 3T, \ldots, gT$, one sequentially estimates the
coefficients of the failure dynamics $a_0, a_1, \ldots, a_h$, increasing $h$ until
no further error reduction is achieved. The resultant set of
coefficients is then used in (2) to determine the component lifetime
$L$. Upon solving the equation, the resultant estimated lifetime is
found to be the smallest integer $L^0$ such that

$$\sum_{j=0}^{L^0-1} \sum_{i=0}^{h} a_i^0 j^i \ge 1. \tag{6}$$

Of course, if the measured data is not decaying towards zero, i.e.,
the component is not failing, this inequality will have no solution,
in which case we take the estimated lifetime to be infinite [4].
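
The batch form of this estimation step can be sketched with NumPy as follows. The sketch makes several assumptions that are not in the paper: the function names are hypothetical, `numpy.linalg.lstsq` is used in place of the sequential refinement schemes of [5], [8], the order `h` is fixed by the caller rather than increased adaptively, and `kT` is taken to be an integer. The cutoff `max_shocks` is an arbitrary stand-in for the "infinite lifetime" case.

```python
import numpy as np


def design_matrix(g, h, kT):
    """Matrix J of (4): entry (m, i) is the sum of j**i for j = 0 .. m*kT - 1."""
    J = np.zeros((g, h + 1))
    for m in range(1, g + 1):
        j = np.arange(m * kT, dtype=float)
        for i in range(h + 1):
            J[m - 1, i] = np.sum(j ** i)
    return J


def estimate_coefficients(C, h, kT):
    """Least squares solution of J A = Z with Z = 1 - C, as in (5)."""
    C = np.asarray(C, dtype=float)
    J = design_matrix(len(C), h, kT)
    A, *_ = np.linalg.lstsq(J, 1.0 - C, rcond=None)
    return A


def estimated_lifetime(A, max_shocks=100_000):
    """Smallest N satisfying (6); None if no N <= max_shocks qualifies."""
    total = 0.0
    for N in range(1, max_shocks + 1):
        j = N - 1
        total += sum(a * j ** i for i, a in enumerate(A))
        if total >= 1.0:
            return N
    return None


# Noise-free data generated from (2) with hypothetical coefficients.
true_A = [0.01, 0.002]
kT = 2
C = [1 - sum(true_A[0] + true_A[1] * j for j in range(m * kT)) for m in range(1, 9)]
A0 = estimate_coefficients(C, h=1, kT=kT)
L0 = estimated_lifetime(A0)
```

With noise-free data the coefficients are recovered essentially exactly; with measured data the residual of the fit is what one would monitor when deciding whether to raise `h`.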

IV. Replacement Theory

Although the algorithm outlined in the preceding section
yields an "optimal" estimate of the number of shocks required to
cause failure, the time at which the $L$th shock takes place is statistical
in nature, and hence it still remains to determine the optimal
(in an appropriate sense) time at which to replace the component.
One such criterion is formulated in the following. For this
purpose, it is assumed that $L$ has been computed to our satisfaction
and we desire to choose a time $T_r$ at which to replace the
component as a function of $L$. Given $L$ and $T_r$, we denote the
resultant probability of on-line failure (i.e., failure before $T_r$) by
$P_f$. $P_r = 1 - P_f$ then denotes the probability that the component
is replaced at time $T_r$ before it fails. Similarly, we let $t_f$ denote the
expected time to failure for those components which fail on-line,
we let $\bar{t}$ denote the expected time to failure for all components,
and we let $t^*$ denote the expected time to failure for the components
if they were operated to failure without replacement (i.e.,
$t^* = \bar{t}\,|_{T_r \to \infty}$). Finally, we let $f_L(t)$ denote the probability density
function that the component receives the $L$th shock at time $t$,
given that the component fails on-line, whereas $p_i(t)$ represents the
density function of the Poisson distribution with parameter $(kt)$,
and $E_L(t)$ represents the corresponding distribution function; i.e.,

$$p_i(t) = \frac{(kt)^i}{i!} e^{-kt}, \qquad i = 0, 1, 2, \ldots \tag{7}$$

$$E_L(t) = \sum_{i=0}^{L-1} p_i(t). \tag{8}$$

With the aid of some elementary calculus [9], $P_r$, $P_f$, $t_f$, and $\bar{t}$,
as well as their derivatives with respect to $T_r$, can be computed
analytically. As such, upon defining an appropriate cost measure,
an explicit formula for determining an "optimal" $T_r$ (given $L$) can
be derived. We begin with the derivation of the explicit formula
for the various quantities involved in our replacement theory.
Since a component will be replaced by our algorithm if and
only if it is still operating at time $T_r$, i.e., if it has not yet received $L$
shocks at time $T_r$, the probability of replacement is just the probability
of receiving less than $L$ shocks by time $T_r$. We thus have the
following.

Property 1:

$$P_r = E_L(T_r)$$

Proof:

$$P_r = \sum_{i=0}^{L-1} \frac{(kT_r)^i}{i!} e^{-kT_r} = \sum_{i=0}^{L-1} p_i(T_r) = E_L(T_r)$$

Property 2:

$$P_f = 1 - E_L(T_r)$$

Property 3:

$$\int_0^{T_r} p_L(t)\,dt = (1/k)(1 - E_{L+1}(T_r)).$$

Proof:

$$\int_0^{T_r} p_L(t)\,dt = \int_0^{T_r} \frac{(kt)^L}{L!} e^{-kt}\,dt = \frac{k^L}{L!} \int_0^{T_r} t^L e^{-kt}\,dt. \tag{9}$$

Using the identity

$$\int x^m e^{ax}\,dx = e^{ax} \sum_{r=0}^{m} (-1)^r \frac{m!}{(m-r)!} \frac{x^{m-r}}{a^{r+1}} \tag{10}$$

this becomes

$$\int_0^{T_r} p_L(t)\,dt = \frac{k^L}{L!} \left\{ \frac{L!}{k^{L+1}} - e^{-kT_r} \sum_{r=0}^{L} \frac{L!}{(L-r)!} \frac{T_r^{L-r}}{k^{r+1}} \right\} \tag{11}$$

$$= \frac{1}{k} \left\{ 1 - \sum_{r=0}^{L} \frac{(kT_r)^{L-r}}{(L-r)!} e^{-kT_r} \right\} = \frac{1}{k} \left\{ 1 - \sum_{j=0}^{L} p_j(T_r) \right\} = \frac{1}{k} \{ 1 - E_{L+1}(T_r) \}. \tag{12}$$

Property 4:

$$f_L(t) = \frac{p_{L-1}(t)}{(1/k)(1 - E_L(T_r))}$$

Proof: To derive this conditional density function we partition
the interval $(0, T_r)$ into $N$ segments of length $\Delta = T_r/N$, and
we compute the probability that the $L$th shock takes place in the
$i$th time interval $((i-1)\Delta, i\Delta)$. Since this can be caused by having
$L - 1$ shocks before $(i-1)\Delta$ and at least one shock in the interval
$((i-1)\Delta, i\Delta)$, or by having $L - 2$ shocks before $(i-1)\Delta$ and at

Hence

$$k\bar{t} = L\{1 - E_{L+1}(T_r)\} + kT_r E_L(T_r). \tag{25}$$

Thus by direct differentiation

$$\frac{d(k\bar{t})}{d(kT_r)} = L\{p_L(T_r)\} + kT_r(-p_{L-1}(T_r)) + E_L(T_r) = L p_L(T_r) - kT_r p_{L-1}(T_r) + E_L(T_r). \tag{26}$$

Since

$$L p_L(T_r) = L \frac{(kT_r)^L}{L!} e^{-kT_r} = (kT_r) \frac{(kT_r)^{L-1}}{(L-1)!} e^{-kT_r} = kT_r\, p_{L-1}(T_r) \tag{27}$$

this reduces to

$$\frac{d(k\bar{t})}{d(kT_r)} = E_L(T_r) \tag{28}$$

as required.

Given the above statistics for replacement, on-line failure, and
expected time to failure of a component with estimated lifetime $L$
and assumed replacement time $T_r$, we desire to choose $T_r$ (given $L$)
which minimizes some appropriate cost function. Intuitively, this
cost function should represent both the cost of on-line failure and
the cost of wasted component lifetime due to replacing components
before failure [12], [13]. We therefore adopt the cost
functional

$$\text{cost} = C_f P_f + C_w (k t^* - k \bar{t}). \tag{29}$$

Here $C_f$ and $C_w$ are appropriate weight factors representing the
cost of on-line failure and the cost of component lifetime wastage,
respectively. Thus the first term in the cost functional represents
the expected cost of a failure (i.e., the probability of an on-line
failure times the cost of such a failure), whereas the second term in
the cost functional represents the expected cost of wasted component
lifetime (i.e., the expected lifetime reduction times the cost
per unit time for such a lifetime reduction, with $k$ serving as a
normalizing factor).

To minimize the cost functional of (29), one simply substitutes
the values for $P_f(T_r)$, $t^*$, and $\bar{t}(T_r)$ computed in the preceding
pages, differentiating the cost with respect to $kT_r$ and setting it
equal to zero. This then results in the equality [4]

$$0 = C_f\, p_{L-1}(T_r) - C_w E_L(T_r) \tag{30}$$

where $d(P_f)/d(kT_r)$ is given by property 9 and $d(k\bar{t})/d(kT_r)$ is given
by property 10. Thus the choice of an optimal $T_r$ (given $L$) is
reduced to the solution of a single nonlinear equation in one
unknown. The solutions of this equation are plotted in Fig. 1 for a
number of values of $L$ and $C_f/C_w$. Indeed, it can be readily shown
that (30) has exactly one solution for $T_r > 0$. Moreover, the
function

$$R_L(t) = C_f\, p_{L-1}(t) - C_w E_L(t) \tag{31}$$

takes on negative values for $0 < t < T_r$ and positive values for
$T_r < t$; hence in an on-line maintenance system one need not even
solve (30). Rather, one simply evaluates $R_L(t)$ at the time of the
next scheduled maintenance. If this results in a negative number,
the next scheduled maintenance precedes the optimal replacement
time, and hence we should wait at least until the next scheduled
maintenance (when we will have more data) to replace the
component. On the other hand, if the evaluation of $R_L(t)$ at
the next scheduled maintenance time results in a positive value,
the optimal replacement time will have passed by the next
scheduled maintenance, and hence the component should be
replaced at the present maintenance interval.

Fig. 1. Replacement time ($kT_r$) versus lifetime $L$ with different weight constants.
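
As a rough numerical companion to (30) and (31), the sketch below evaluates $R_L$ after dividing through by $C_w$, locates its sign change by bisection, and applies the sign test at a scheduled maintenance time. The function names, the bracketing strategy, and the use of `math.lgamma` for numerical stability are assumptions of this sketch and are not taken from the paper. It presumes $L \ge 2$ and $C_f/C_w > 1$ so that a sign change exists.

```python
import math


def poisson_p(i, x):
    """p_i of (7) written in terms of x = k t > 0."""
    return math.exp(-x + i * math.log(x) - math.lgamma(i + 1))


def E(L, x):
    """E_L of (8): probability of fewer than L shocks when the mean is x."""
    return sum(poisson_p(i, x) for i in range(L))


def R(L, x, cf_over_cw):
    """R_L of (31) divided by C_w; the sign is unchanged for C_w > 0."""
    return cf_over_cw * poisson_p(L - 1, x) - E(L, x)


def optimal_kT(L, cf_over_cw, iters=200):
    """Bisection for the root of (30) in the variable x = k T_r."""
    lo, hi = 1e-9, float(L)
    if R(L, lo, cf_over_cw) >= 0:
        raise ValueError("R is not negative near zero; no sign change to bracket")
    while R(L, hi, cf_over_cw) < 0:
        hi *= 2.0
        if hi > 1e6:
            raise ValueError("no sign change found")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if R(L, mid, cf_over_cw) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def replace_now(L, k, next_maintenance_time, cf_over_cw):
    """Decision rule from the text: replace if R_L at the next maintenance time is >= 0."""
    return R(L, k * next_maintenance_time, cf_over_cw) >= 0


x_opt = optimal_kT(28, 50.0)  # L = 28 and C_f/C_w = 50, values used later in the paper
```

For $L = 28$ and $C_f/C_w = 50$ the root falls between $kT_r = 19$ and $kT_r = 20$, which is in line with the later remark that the optimal fixed strategy is approximately ten maintenance intervals when each interval carries two expected shocks.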

V.  The  Algorithm 

In summary, the on-line maintenance algorithm resulting from
the above theory takes the following form. At the $g$th scheduled
maintenance interval (at time $gT$) one measures the component
parameter $C_g$. If $C_g$ is already out of tolerance, the component is
simply replaced, and no further analysis is required. If, however,
$C_g$ is in tolerance ($C_g > 0$ in our notation), it is used together with
the values of the component parameter measured at the previous
maintenance intervals to estimate the dynamics of the failure
model for the component. Here sequential refinement schemes
may be used both to include the effect of $C_g$ on the estimates made
at the $(g-1)$st maintenance interval and to increase the order of
the polynomial used to represent the component failure dynamics.
Once the component failure dynamics have been satisfactorily
estimated, one evaluates (31) to decide whether or not to replace
the component. If $R_L((g+1)T) \ge 0$, the component is replaced,
whereas if $R_L((g+1)T) < 0$, the component is not replaced, and
the system is returned to service until the next scheduled
maintenance.

VI.  Simulations 

A computer simulation of an on-line periodic maintenance
system based on the above described algorithm was performed for
600 maintenance intervals with a new component replacing the
old component after each replacement decision and/or on-line
failure [4]. The system was subjected to computer-generated Poisson
shocks with constant $k = 0.1$ shocks per hour and a maintenance
interval of $T = 20$ h. The simulation was first run using
identical components with $L = 28$ (expected lifetime of 14 maintenance
intervals) and then repeated using random components
and noisy data measurements.

TABLE I

Total Replacements and Failures Within 600 Maintenance
Intervals for Different $C_f/C_w$

C_f/C_w    No. of replacement    No. of failure
50         4                     7
75         50                    1
100        52                    2
150        54                    2
200        54                    2

TABLE II

Total Replacements and Failures Within 600 Maintenance
Intervals for Various Fixed Replacement Strategies

Constant replacement time    No. of replacement    No. of failure
every 6 intervals            100                   0
every 7 intervals            85                    0
every 8 intervals            75                    0
every 9 intervals            65                    1
every 10 intervals           59                    1
every 11 intervals           48                    6
every 12 intervals           39                    11

TABLE IV

Total Replacements and Failures Within 600 Maintenance
Intervals for Various Fixed Replacement Strategies
and the Algorithm at Different Noise Levels

Noise level               20%              30%              40%              60%
                     No. of  No. of   No. of  No. of   No. of  No. of   No. of  No. of
Method               replace fail     replace fail     replace fail     replace fail
every 6 intervals    100     0        100     0        100     0        94      0
every 7 intervals    85      0        85      0        04      1        79      •
every 8 intervals    75      0        72      3        71      4        04      12
every 9 intervals    04      2        03      3        ao      7        52      17
every 10 intervals   so      4        51      9        45      15       45      10
every 11 intervals   45      10       45      10       45      10       39      20
every 12 intervals   30      15       35      <0       30      17       31      23
the algorithm        SO      3        SS      5        55      5        SO      11

TABLE III

Overall Cost with Different Methods and Different $C_f/C_w$

Method                C_f/C_w = 50    75      100     150     200
every 6 intervals     1600            1600    1600    1600    1600
every 7 intervals     1180            1180    1180    1180    1180
every 8 intervals     900             900     900     900     900
every 9 intervals     698             723     748     798     848
every 10 intervals    530             555     580     630     680
every 11 intervals    612             762     912     1212    1512
every 12 intervals    750             1025    1300    1850    2400
the algorithm         530             471     512     366     766

TABLE V

Overall Cost for Different Methods at Different Noise Levels

Method                0%      20%     30%     40%     60%
every 6 intervals     1000    1800    1600    1600    2200
every 7 intervals     1090    1098    1098

For the case where identical components were employed, Table
I gives the total number of replacements and failures resulting
from the application of the algorithm over the 600 simulated
maintenance intervals with different values of $C_f/C_w$. For
comparison, Table II shows the total number of replacements and
failures which would have resulted from a fixed replacement
strategy ranging from six to twelve maintenance intervals. Since
the cost function is

$$\text{cost} = C_f P_f + C_w (k t^* - k \bar{t}) \tag{32}$$

the overall cost can be expressed as

$$\text{cost} = (\text{number of failures}) \frac{C_f}{C_w} + 0.1\,(280 \times (\text{number of components used}) - 12000). \tag{33}$$

The overall costs resulting from the application of our algorithm
and the various fixed replacement schedules may be computed
from the data in Tables I and II. The resultant costs for different
values of $C_f/C_w$ are given in Table III.
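
The arithmetic of (33) is easy to reproduce. In the sketch below the function name and argument names are hypothetical, and the "number of components used" is assumed to equal replacements plus failures, an interpretation that reproduces the legible fixed-schedule entries of the tables. The constants are those of the first simulation: $k = 0.1$, an expected lifetime of $L/k = 280$ h, and a total of $600 \times 20 = 12000$ h.

```python
def overall_cost(failures, replacements, cf_over_cw,
                 k=0.1, lifetime_hours=280.0, total_hours=12000.0):
    """Overall cost of (33), expressed in units of C_w."""
    components_used = failures + replacements  # assumed interpretation
    wasted_lifetime = k * (lifetime_hours * components_used - total_hours)
    return failures * cf_over_cw + wasted_lifetime


# Fixed replacement every 10 intervals: 59 replacements and 1 failure.
cost_every_10 = overall_cost(failures=1, replacements=59, cf_over_cw=50)   # about 530
# Fixed replacement every 12 intervals: 39 replacements and 11 failures.
cost_every_12 = overall_cost(failures=11, replacements=39, cf_over_cw=75)  # about 1025
```

The first term grows with the failure weight while the second depends only on how many components were consumed, which is why the rows with no failures have the same cost at every value of $C_f/C_w$.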

Note that since $L = 28$ for each component in this simulation,
an optimal replacement strategy of approximately ten maintenance
intervals can be computed from (30) without estimating $L$.
As such, it is not surprising that this fixed replacement strategy
resulted in lower costs than the algorithm. It should, however, be
noted that the algorithm did not have the advantage of an a priori
knowledge of $L$ and yet it still outperformed all of the fixed replacement
strategies except the optimal strategy (that is, optimal
once $L$ is known).

In our second simulation, random noise was added to the data
to simulate both the effects of imperfect measurement and the
effect of components with random failure characteristics. Various
simulations were run as before for 600 maintenance intervals each,
with $k = 0.1$ and $T = 20$ and with noise levels ranging between 20
and 60 percent of the tolerance interval. The results of these simulations
are given in Tables IV and V. Except for a single case,
which we believe to be anomalous, the algorithm outperformed
any fixed replacement strategy.
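
The fixed replacement schedules used as a baseline in these simulations can be imitated with a short Monte Carlo sketch. This is not the authors' simulation code: the function name, the seed handling, and the simplification that failures are registered at maintenance-interval granularity (with the new component starting at the next interval boundary) are all assumptions, and the components are noise-free with a known lifetime of `L` shocks.

```python
import numpy as np


def simulate_fixed_schedule(period, L=28, k=0.1, T=20.0, intervals=600, seed=0):
    """Count replacements and on-line failures for a fixed replacement period.

    period is the component age, in maintenance intervals, at which it is replaced.
    """
    rng = np.random.default_rng(seed)
    replacements = 0
    failures = 0
    shocks = 0  # shocks received by the component currently in service
    age = 0     # age of that component in maintenance intervals
    for _ in range(intervals):
        shocks += rng.poisson(k * T)  # Poisson shocks arriving during one interval
        age += 1
        if shocks >= L:               # the L-th shock occurred: on-line failure
            failures += 1
            shocks = age = 0
        elif age == period:           # scheduled preventive replacement
            replacements += 1
            shocks = age = 0
    return replacements, failures


results = {period: simulate_fixed_schedule(period) for period in range(6, 13)}
```

Short periods should give many replacements and almost no failures, while long periods trade replacements for failures, the same qualitative pattern as in Table II. The exact counts depend on the random seed.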

VII.  Conclusion 

In the preceding pages, we have described a curve fitting algorithm
for the prediction of failures in analog devices. The algorithm
was tested in a variety of situations and found to be surprisingly
effective in predicting failures with relatively little wastage of component
lifetime and on-line failure cost.

An Approach to Built-In Testing


R. SAEKS, Fellow, IEEE
Texas Tech University


Abstract

The architecture and justification for an approach to built-in testing
(BIT) in electronic circuits and systems are presented. The proposed
system is capable of on-line fault detection and prediction
up to the shop replaceable assembly (SRA) level and is designed
to interface with external automatic test equipment (ATE) for
off-line fault diagnosis within the SRA. The constituent parts of
the BIT system have been extensively simulated and the approach
appears to be suitable for hardware implementation both with
respect to computational and economic considerations.


I.  Introduction 

An approach to built-in testing (BIT) for electronic circuits
and systems is outlined. The approach assumes a two-level
hierarchical architecture in which a central microprocessor
controls and coordinates the testing of a number
of subsystems each of which has built-in test equipment
(BITE) such as sensors and a nanoprocessor for preprocessing
the test data prior to transmission to the central microprocessor.
The approach allows for on-line fault detection
and prediction up to the level of a shop replaceable assembly
(SRA) and off-line fault diagnosis within the various SRA's.
Section II is devoted to a description of the BIT system
architecture. This two-level structure has been formulated
to be applicable either at the printed circuit board level in
which the SRA's represent individual devices (IC chips,
elementary components, etc.) or at the level of an entire
electronics system in which the SRA's represent printed
circuit boards.

Section  III  is  devoted  to  a  study  of  the  fault  diagnosis 
problem.  In  cither  the  case  of  a  linear  or  nonlinear  circuit 
it  is  shown  that  this  problem  can  be  reduced  to  the  solution 
of  a  set  of  simultaneous  nonlinear  algebraic  equations.  In 
the  proposed  BIT  architecture  a  linearization  of  these  equa¬ 
tions  is  used  on-line  for  fault  detection  and  prediction 
whereas  the  full  set  of  nonlinear  equations  are  used  off-line 
for  fault  diagnosis  within  the  SRA. 

Two  algorithms  for  fault  prediction  are  described  in 
Section  IV.  Both  are  essentially  curve  fitting  algorithms 
implemented  on  the  central  test  microprocessor  in  a  time 
multiplexed  mode.  Here  the  microprocessor  periodically 
receives  test  data  from  the  various  SRA's  and  extrapolates 
this data to determine whether or not the SRA is likely to
fail  in  the  near  future.  The  final  section  of  the  paper  is 
devoted  to  a  discussion  of  the  concept  of  self-testing,  in 
particular,  the  possibility  of  self-testing  in  a  predictive 
mode. 

At the time of this writing the approach to BIT described
has yet to be fully implemented. It is, however, predicated
on several years of research in the area and each of its constituent
subsystems has been extensively simulated [17, 18, 19].
At the present time the hardware implementation
of the various algorithms is under investigation [15, 16]
and it is hoped that an entire BIT system will be in operation
in the near future.

II.  BIT  Architecture 

The basic BIT architecture is a two-level hierarchical
structure illustrated in Fig. 1. Intuitively, the overall system
may represent a printed circuit board while the subsystems
represent various SRA's such as integrated circuits, power
supplies, vacuum tubes, etc. Alternatively, the overall
system may represent an entire electronics system with the
SRA's being its constituent printed circuit boards. In
either  case  the  SRA's  may  be  throw-away  units  or  units 
intended  for  off-line  repair  with  BITE  designed  to  detect 
and/or  predict  faults  in  the  SRA.  For  those  units  intended 
Fig. 1. Two-level BIT architecture.

for  off-line  repair  the  BITE  may  also  be  used  as  an  inter¬ 
face  with  an  external  test  stand  but  will  not  be  capable  of 
isolating  the  failure  within  the  SRA. 

This structure is motivated by several years of basic research
into the relative computational complexity of the
three fundamental problems of fault analysis: fault detection,
fault diagnosis, and fault prediction [9]. The latter
problem requires considerable computational power [18, 19]
but need only be carried out at widely spaced test
intervals, say one test per hour (minute, second, ?). As
such,  a  single  central  microprocessor  can  be  time-multiplexed 
through  the  testing  of  a  large  number  of  subsystem  para¬ 
meters  thereby  achieving  the  required  computational  power 
for  the  fault  prediction  algorithm  while  still  holding  the 
amount  of  dedicated  test  equipment  within  reasonable 
bounds  [10] . 

While  fault  diagnosis  can  be  carried  out  with  consider¬ 
able  success  the  process  requires  significant  computational 
power  (at  least  a  mini  by  today’s  standards)  and  lengthy 
computer  runs  [7,  11,  17] .  As  such,  fault  diagnosis  within 
an SRA is done off-line on an external test stand containing
the required mini (or maxi) computer. Each SRA, however,
will include sufficient BITE, say a nanoprocessor,
to collect and condition test data on the SRA to be periodically
communicated to the central microprocessor for
purposes  of  fault  prediction  and  detection. 

Fortunately,  both  of  these  endeavors  may  be  achieved 
using  a  model  of  the  SRA  linearized  about  its  nominal 
values  and  hence  can  be  implemented  with  relatively 
little computational power built into the SRA [12]. For
fault prediction in particular, one is interested in tracking
various internal parameters of the SRA as they drift
from nominal to their tolerance limit. Since the tolerance
interval is typically only a few percent this can be
achieved  with  a  linearized  model.  For  catastrophic  errors 
a  linearized  model  may  be  used  to  detect  failures  even 
though  it  is  not  sufficiently  accurate  for  fault  diagnosis. 

As  such,  the  BITE  within  an  SRA  may  be  kept  within 
reasonable  bounds  while  still  delivering  sufficient  data  to 
the  central  microprocessor  for  its  fault  prediction  and 
fault  detection  tasks.  If  needed,  fault  diagnosis  within  an 
SRA can, however, be done off-line with the BITE simply
serving  as  an  interface  between  the  SRA  and  an  external 
test  stand. 


A  final  aspect  of  the  BIT  architecture  is  the  communi¬ 
cation  link  between  the  SRA's  and  the  central  micropro¬ 
cessor.  Here  one  desires  to  keep  the  wiring  between  the 
SRA’s  and  the  central  microprocessor  at  a  minimum  and 
simultaneously  have  all  data  transmitted  to  the  central 
microprocessor in a uniform format to permit interchangeability
of component parts within the system. Although
the  details  of  this  communications  link  have  yet  to  be 
formalized,  the  existence  of  an  active  computing  capa¬ 
bility  in  each  SRA  gives  one  considerable  flexibility.  As 
such,  we  believe  that  it  will  be  possible  to  work  with  a 
single test bus [16]. Here the central microprocessor
requests  data  from  the  individual  SRA’s  by  transmitting 
a  signal  on  the  bus.  This  signal  is  received  by  the  built- 
in  nanoprocessor  in  the  SRA  which,  in  turn,  transmits 
appropriately  conditioned  test  data  back  to  the  central 
microprocessor  on  the  same  bus. 

The  BIT  architecture  just  described  would  seem  to 
achieve most of the requirements for a BIT system:

1) continuous on-line fault prediction and detection down to
the level of an SRA is achieved;

2)  the  system  includes  an  interface  for  off-line  fault 
diagnosis  within  an  SRA; 

3)  dedicated  test  equipment  represents  a  small  percen¬ 
tage  of  the  total  system; 

4)  busing  is  minimized  and  test  data  is  transmitted  to 
the  central  microprocessor  in  a  uniform  format 
thereby  facilitating  component  interchangeability. 

III.  Fault  Diagnosis 

For  the  purposes  of  doing  fault  diagnosis  we  work  with 
a  component  connection  model  for  the  circuit  or  system 
under test which takes the form

$$b_i = Z_i(s, r)\, a_i, \qquad i = 1, 2, \ldots, n \tag{1}$$

$$a = L_{11} b + L_{12} u$$

$$y = L_{21} b + L_{22} u \tag{2}$$

in the frequency domain [6, 11, 12]. Here $Z_i(s, r)$ is the
transfer function of the ith circuit or system component,
where $r = \mathrm{col}(r_i)$ is the vector of unknown component
parameters and $s$ is the complex frequency variable. Typically,
the unknown component parameters take the form
of amplifier gains and cutoff frequencies, pole and zero
positions, resistances, inductances, etc. In particular, it
is assumed that enough parameters are employed to completely
characterize the performance of the device. The
$L_{ij}$ are known connection matrices, $a = \mathrm{col}(a_i)$ and
$b = \mathrm{col}(b_i)$ are composite vectors of component inputs and
outputs, respectively, and $u$ and $y$ are the test input and
output signals, respectively. In the nonlinear case the component
equations are replaced by the state models

$$\dot{x}_i = f_i(x_i, a_i, r)$$

$$b_i = g_i(x_i, a_i, r), \qquad i = 1, 2, \ldots, n \tag{3}$$
with the connection equations remaining as in (2). Although
these component connection models for a circuit
or system are nonclassical they are widely used in large-scale
system simulation and computer-aided circuit design
and are readily amenable to the "computer speed-up techniques"
developed for these applications [12]. As such,
they are ideally suited for the fault diagnosis problem.

Combining  (1)  and  (2)  yields  the  fault  diagnosis  equa¬ 
tion  [7] 

$$S_m = L_{22} + L_{21} [1 - Z(s, r) L_{11}]^{-1} Z(s, r) L_{12} \tag{4}$$

where $Z(s, r) = \mathrm{diag}[Z_i(s, r)]$ and $S_m$ is the measured transfer
function relating the input test signal $u$ to the output
test signal $y$. The solution of the fault diagnosis problem
therefore amounts to the solution of (4) for the parameter
vector $r$, given $S_m$ and the connection matrices. Although
it is possible to give an analytic description of all possible
solutions to this equation [12, 13] given any fixed value
for the complex frequency variable $s$, in a "real world"
situation the number of unknowns greatly exceeds the number
of equations and, as such, the analytic representation of
the solution manifold proves to be of little value. This
difficulty is alleviated via a multifrequency diagnosis
algorithm wherein one writes the set of simultaneous
equations:


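$$\begin{aligned}
S_m(s_1) &= L_{22} + L_{21} [1 - Z(s_1, r) L_{11}]^{-1} Z(s_1, r) L_{12} \\
S_m(s_2) &= L_{22} + L_{21} [1 - Z(s_2, r) L_{11}]^{-1} Z(s_2, r) L_{12} \\
&\;\;\vdots \\
S_m(s_k) &= L_{22} + L_{21} [1 - Z(s_k, r) L_{11}]^{-1} Z(s_k, r) L_{12}
\end{aligned} \tag{5}$$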

where  k  different  complex  frequencies  are  used  in  (4) 
simultaneously.  The  interesting  and  somewhat  surprising 
result  is  that  the  additional  equations  in  (Si  may  be  inde¬ 
pendent  thus  increasing  the  number  of  fault  diagnosis 
equations  without  increasing  the  number  of  its  unknowns 
[7]  While  the  set  of  simultaneous  equations  (5)  often  has 
a  unique  solution,  no  analytic  solution  technique  is  known 
and  we  must  resort  to  time-consuming  numerical  solution 
procedures  carried  out  off-line. 

Although  the  multifrequency  fault  diagnosis  equations 
of  (5)  do  not  admit  an  analytic  solution  their  numerical 
solution  can  be  significantly  speeded  up  by  careful  anal¬ 
ysis  of  the  equations.  In  particular,  a  little  algebra  [6,  12] 
will  reveal  that 

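$$\frac{\partial S_m}{\partial r_j} = L_{21} [1 - Z(s_i, r) L_{11}]^{-1} \left[ \frac{\partial Z(s_i, r)}{\partial r_j} \right] \left\{ 1 + L_{11} [1 - Z(s_i, r) L_{11}]^{-1} Z(s_i, r) \right\} L_{12} \tag{6}$$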

showing  that  one  can  compute  the  partial  derivatives  required 
for  the  numerical  solution  to  (5)  analytically.  Moreover,  if 
one observes that the inverse matrix required to compute
the partial derivatives in (6) is precisely the same inverse
matrix required to evaluate the multifrequency fault diagnosis
equations (5) it is seen that the partial derivative
information is obtained at virtually no computational cost
above  that  required  for  the  evaluation  of  the  equations. 
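The reuse of a single inverse for both the equations and their partial derivatives can be sketched numerically. The following is a minimal illustration only: it assumes a hypothetical two-component circuit (a series resistor feeding a shunt capacitor), and the connection matrices, function names, and parameter values are all chosen here for illustration rather than taken from the paper. It evaluates the right-hand sides of (4) and (6) from one shared inverse.

```python
# Sketch only: evaluates Eq. (4) and the sensitivity of Eq. (6) from one
# shared inverse. The RC low-pass example and every name below are
# assumptions made for illustration; they do not come from the paper.

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Connection matrices for a series resistor feeding a shunt capacitor:
# a1 = u - b2 (resistor voltage), a2 = b1 (capacitor current), y = b2.
L11 = [[0.0, -1.0], [1.0, 0.0]]
L12 = [[1.0], [0.0]]
L21 = [[0.0, 1.0]]
L22 = 0.0

def Z(s, r):
    """diag[Z_i(s, r)]: resistor admittance 1/R, capacitor impedance 1/(sC)."""
    R, C = r
    return [[1.0 / R, 0.0], [0.0, 1.0 / (s * C)]]

def dZ(s, r, j):
    """Partial derivative of Z with respect to r[j]."""
    R, C = r
    if j == 0:
        return [[-1.0 / R**2, 0.0], [0.0, 0.0]]
    return [[0.0, 0.0], [0.0, -1.0 / (s * C**2)]]

def resolvent(s, r):
    """The single inverse [1 - Z(s, r) L11]^{-1} shared by Eqs. (4) and (6)."""
    ZL = matmul(Z(s, r), L11)
    return inv2([[(i == j) - ZL[i][j] for j in range(2)] for i in range(2)])

def transfer(s, r, M):
    """Right-hand side of Eq. (4), given the precomputed inverse M."""
    return L22 + matmul(matmul(matmul(L21, M), Z(s, r)), L12)[0][0]

def sensitivity(s, r, j, M):
    """Right-hand side of Eq. (6), reusing the same inverse M."""
    LMZ = matmul(matmul(L11, M), Z(s, r))
    brace = [[(i == k) + LMZ[i][k] for k in range(2)] for i in range(2)]
    left = matmul(matmul(L21, M), dZ(s, r, j))
    return matmul(matmul(left, brace), L12)[0][0]

s, r = 1.0 + 2.0j, (2.0, 0.5)
M = resolvent(s, r)                      # one inversion per frequency
S_value = transfer(s, r, M)
gradient = [sensitivity(s, r, j, M) for j in range(2)]
```

For this toy circuit the transfer function reduces to 1/(1 + sRC), so only the product RC is identifiable; the example is meant to show the bookkeeping of (6), not a well-posed diagnosis problem.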

In a similar vein one can reduce the computation required
to  compute  the  inverses  at  different  complex  frequencies 
by  integrating  the  differential  equation 

$$\frac{d}{ds} [1 - Z(s, r) L_{11}]^{-1} = [1 - Z(s, r) L_{11}]^{-1} \left[ \frac{dZ(s, r)}{ds} \right] L_{11} [1 - Z(s, r) L_{11}]^{-1} \tag{7}$$

using  the  inverse  computed  at  one  particular  frequency  as  a 
starting point [2, 14]. Although of extremely high dimension
this equation is easily integrated without the requirement
for matrix inversions. With the aid of these observations
it is possible to carry out an entire iteration of a
Newton-Raphson  algorithm  for  the  solution  of  the  multi- 
frequency  fault  diagnosis  equations  with  the  aid  of  only 
a  single  matrix  inversion. 
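The idea of propagating the inverse from one frequency to another by integrating (7) can be sketched as follows. This is an illustration under stated assumptions: the same hypothetical RC example as a test case, a real-valued frequency path, and a fixed-step fourth-order Runge-Kutta integrator, none of which is prescribed by the paper. Only matrix products appear inside the integration loop; the one inversion is the starting value.

```python
# Sketch only: propagate [1 - Z(s) L11]^{-1} from s0 to s1 by integrating
# Eq. (7) with classical RK4. The circuit values, the integrator, and all
# names are illustrative assumptions.

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    """Invert a 2x2 matrix (used once, for the starting value)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def axpy(A, B, c):
    """Return A + c * B elementwise."""
    return [[A[i][j] + c * B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

L11 = [[0.0, -1.0], [1.0, 0.0]]
R, C = 2.0, 0.5

def Z(s):
    return [[1.0 / R, 0.0], [0.0, 1.0 / (s * C)]]

def dZ_ds(s):
    return [[0.0, 0.0], [0.0, -1.0 / (s * s * C)]]

def rhs(s, M):
    """Right-hand side of Eq. (7): M (dZ/ds) L11 M."""
    return matmul(matmul(matmul(M, dZ_ds(s)), L11), M)

def direct_inverse(s):
    ZL = matmul(Z(s), L11)
    return inv2([[(i == j) - ZL[i][j] for j in range(2)] for i in range(2)])

def integrate_inverse(M0, s0, s1, steps=200):
    """Integrate Eq. (7) from s0 to s1 starting from the known inverse M0."""
    h = (s1 - s0) / steps
    s, M = s0, M0
    for _ in range(steps):
        k1 = rhs(s, M)
        k2 = rhs(s + h / 2, axpy(M, k1, h / 2))
        k3 = rhs(s + h / 2, axpy(M, k2, h / 2))
        k4 = rhs(s + h, axpy(M, k3, h))
        for k, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            M = axpy(M, k, w * h / 6)
        s += h
    return M

M_integrated = integrate_inverse(direct_inverse(1.0), 1.0, 2.0)
M_direct = direct_inverse(2.0)           # reference value, for comparison only
```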

Although  one  does  not  have  a  “neat"  set  of  equations 
such  as  those  described  above  for  the  solution  of  the  fault 
diagnosis  problem  in  a  nonlinear  circuit  or  system,  surpris¬ 
ingly  similar  computational  techniques  can  be  invoked  in 
the  nonlinear  case.  The  key  to  these  techniques  is  the 
replacement  of  the  multifrequency  information  of  the 
linear  case  by  a  family  of  integral  performance  measures 
on the test signals u and y. These play exactly the same
role  in  nonlinear  fault  diagnosis  as  played  by  the  frequency 
information  in  the  linear  case,  allowing  one  to  formulate 
multiple  independent  fault  diagnosis  equations  from  the 
same  test  signals. 

In the nonlinear case, the sparse tableau algorithm [3, 12]
is used to evaluate the fault diagnosis equations at each
iteration of a Newton-Raphson algorithm. As in the
linear case this algorithm allows one to compute the derivative
required for the Newton-Raphson algorithm with
essentially no additional computational cost above that
required for the evaluation of the equations [3, 4, 12].

It  is  possible  to  obtain  significant  computational  gains  in 
the  solution  of  the  fault  diagnosis  equations  in  the  non¬ 
linear  case  as  well  as  in  the  linear  case,  by  optimally  exploit¬ 
ing  the  computational  efficiencies  inherent  in  the  sparse 
tableau  formulation  for  an  electronic  circuit  or  system. 

Even  using  computational  efficiencies  which  are  possible 
for  solving  the  fault  diagnosis  equations,  this  method  is 
still  long  and  tedious  and  not  well  suited  to  on-line  imple¬ 
mentation  in  a  BIT  system.  It  is  thus  recommended  that 
linearization  of  the  fault  diagnosis  equations  be  used  in¬ 
stead.  Although  far  less  accurate  than  the  solution  of  the 
full  set  of  fault  diagnosis  equations  [12] ,  we  believe  that  in 
the  context  of  the  previously  described  BIT  architecture, 
linearization  of  the  fault  diagnosis  equations  will  prove  to 
be  viable.  From  the  point  of  view  of  fault  prediction  one 
is interested only in tracking the unknown parameter vector
$r$ from its nominal value to its tolerance limit (a few percent
deviation from nominal). This is a region in which
the  solution  of  the  linearized  fault  diagnosis  equations 
should be quite accurate. On the other hand, if a catastrophic
fault occurs, solution of the linearized equations
will detect the fault though it may fail to accurately diagnose
it. In this case, however, the linearized test data and
its associated BITE may be employed as an interface between
the SRA and an external test stand. As such, the
use  of  linearized  fault  diagnosis  equations  will  suffice  in 
the  context  of  our  BIT  architecture. 

From  the  point  of  view  of  on-line  analysis  in  a  BIT 
system  the  solution  of  the  linearized  fault  diagnosis  equa¬ 
tions  is  computationally  reasonable.  Since  the  linearization 
is  done  about  the  nominal  value,  it  may  be  precomputed 
[via  (6)  in  the  linear  case  and  the  corresponding  equation 
in  the  nonlinear  case]  and  its  inverse  may  be  precomputed. 
Thus,  the  implementation  of  an  algorithm  for  the  solution 
of  the  linearized  fault  diagnosis  equations  requires  only  a 
single  matrix  multiplication,  the  matrix  having  been  pre¬ 
computed  off-line  and  stored  in  a  read-only  memory. 
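The on-line cost of the linearized approach can be illustrated with a small sketch. It assumes a hypothetical SRA whose response is a first-order transfer function K/(1 + sT) measured at two real test frequencies; the model, the nominal values, and all names are assumptions for illustration. The Jacobian at nominal is inverted once off-line, and the on-line step is a single matrix-vector product.

```python
# Sketch only: linearized parameter estimation about nominal values.
# The first-order model K / (1 + s T), the test frequencies, and all names
# are illustrative assumptions, not the paper's circuit.

K0, T0 = 1.0, 1.0            # nominal parameter values
FREQS = (1.0, 2.0)           # two real test frequencies

def response(K, T):
    """Model response at each test frequency."""
    return [K / (1.0 + s * T) for s in FREQS]

def inv2(M):
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# ---- Off-line: Jacobian at nominal and its inverse (stored, e.g., in ROM) ----
J = [[1.0 / (1.0 + s * T0), -K0 * s / (1.0 + s * T0) ** 2] for s in FREQS]
J_INV = inv2(J)
S_NOM = response(K0, T0)

# ---- On-line: one matrix-vector multiplication per measurement ----
def estimate_parameters(measured):
    """Return the linearized estimate of (K, T) from measured responses."""
    d = [m - n for m, n in zip(measured, S_NOM)]
    dK = J_INV[0][0] * d[0] + J_INV[0][1] * d[1]
    dT = J_INV[1][0] * d[0] + J_INV[1][1] * d[1]
    return K0 + dK, T0 + dT

# A drift of a few percent from nominal is recovered to within a fraction
# of a percent; larger deviations would be estimated less accurately.
K_est, T_est = estimate_parameters(response(1.02, 0.97))
```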

IV.  Fault  Prediction 

In  the  context  of  the  previously  described  BIT  archi¬ 
tecture  the  primary  role  of  the  central  microprocessor  is 
to periodically collect data from the individual SRA's characterizing
their internal parameter vectors $r$. This data is
then  used  to  detect  and  predict  failures  of  the  SRA.  When 
a  failure  is  detected,  the  central  microprocessor  signals 
this  fact  and  the  SRA  is  replaced  and/or  taken  to  an  exter¬ 
nal  test  stand  for  repair.  If  no  failure  is  detected,  the  role 
of  the  central  microprocessor  is  to  compare  the  present 
data  with  previously  measured  values  in  an  endeavor  to 
predict  whether  or  not  failure  is  imminent.  In  this  in¬ 
stance  predicted  failure  of  the  SRA  would  be  signaled  in 
an effort to replace the device before its actual on-line
failure  [20] . 

For  any  particular  device  one  can  collect  statistical 
data  on  which  to  base  a  fault  prediction  algorithm.  How¬ 
ever,  in  a  practical  BIT  setting  where  the  same  fault  pre¬ 
diction  algorithm  is  multiplexed  through  the  testing  of 
many different SRA's, it is necessary to use an algorithm
which  is  independent  of  the  specific  properties  of  the 
parameter  under  test.  As  such,  for  our  BIT  system,  we 
expect  to  employ  a  curve  fitting  algorithm  [20] .  Although 
less  accurate  than  a  statistically  based  algorithm,  we  have 
shown  by  simulation  [18,  19]  that  such  an  algorithm  can 
be  employed  as  a  satisfactory  fault  predictor.  Such 
algorithms  are  computationally  simple  thus  permitting  a 
single  central  microprocessor  to  be  multiplexed  through 
the  testing  of  a  large  number  of  SRA's  [15].  Moreover, 
if  one  assumes  that  the  data  delivered  by  the  SRA  to  the 
central  microprocessor  has  been  uniformly  normalized, 
the  fault  prediction  algorithm  will  be  completely  indepen¬ 
dent  of  the  parameter  under  test.  As  such,  one  is  in  a 
position  to  completely  standardize  the  central  micro¬ 
processor  in  a  BIT  system  so  that  changes  in  an  SRA  do 
not  demand  corresponding  changes  in  the  fault  predic¬ 
tion  algorithm. 

Over the past several years we have investigated several
approaches to the fault prediction problem [5, 15, 16,
18, 19, 20]. The first is extremely naive but has yielded
surprisingly effective results in simulation [18, 19, 20].
Basically, one collects data at periodic intervals, fits the
data with a second-order polynomial, and solves the quadratic
equation to estimate the time at which the parameter
will  go  out  of  tolerance.  The  success  of  this  algorithm  is 
due  to  the  fact  that  one  is  not  really  interested  in  the 
accuracy  of  the  failure  time  estimate  but  only  the  accuracy 
of  the  binary  decision  (based  on  this  estimate)  whether 
or  not  to  replace  the  SRA.  Moreover,  this  binary  decision 
is  only  made  when  failure  is  expected  in  the  near  future, 
a  region  of  time  in  which  a  polynomial  extrapolation  is 
reasonably accurate; i.e., if failure is estimated to take
place  in  3  years  even  if  the  estimate  is  off  by  90  percent, 
the  decision  not  to  replace  the  SRA  at  this  time  will  still 
be  correct. 
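The second-order curve fitting predictor described above can be sketched in a few lines. This is a minimal illustration, assuming a least-squares fit via the normal equations, a single upper tolerance limit, and a fixed decision horizon; these choices and all function names are assumptions, since the text does not specify the fitting method or the decision rule in detail.

```python
# Sketch only: quadratic extrapolation of a drifting parameter and the
# binary replace / do-not-replace decision. Fitting method, single upper
# limit, and all names are illustrative assumptions.
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[piv] = M[piv], M[col]
        for k in range(col + 1, n):
            f = M[k][col] / M[col][col]
            for c in range(col, n + 1):
                M[k][c] -= f * M[col][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def fit_quadratic(ts, rs):
    """Least-squares coefficients (c0, c1, c2) of c0 + c1 t + c2 t^2."""
    A = [[sum(t ** (i + j) for t in ts) for j in range(3)] for i in range(3)]
    b = [sum(r * t ** i for t, r in zip(ts, rs)) for i in range(3)]
    return solve(A, b)

def predicted_failure_time(ts, rs, limit):
    """Earliest future time at which the fitted curve reaches the limit."""
    c0, c1, c2 = fit_quadratic(ts, rs)
    if abs(c2) < 1e-12:                       # fit is effectively linear
        roots = [(limit - c0) / c1] if abs(c1) > 1e-12 else []
    else:
        disc = c1 * c1 - 4.0 * c2 * (c0 - limit)
        if disc < 0.0:
            roots = []                        # curve never reaches the limit
        else:
            sq = math.sqrt(disc)
            roots = [(-c1 - sq) / (2.0 * c2), (-c1 + sq) / (2.0 * c2)]
    future = [t for t in roots if t > ts[-1]]
    return min(future) if future else None

def should_replace(ts, rs, limit, horizon):
    """Binary decision: replace only if failure is predicted within horizon."""
    t_fail = predicted_failure_time(ts, rs, limit)
    return t_fail is not None and t_fail - ts[-1] <= horizon
```

Only the binary decision returned by `should_replace` matters to the BIT system, which is why a coarse estimate of a distant failure time does no harm.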

A fault prediction algorithm based on the above-mentioned
second-order polynomial extrapolation has been
extensively  studied  by  Tung  and  Saeks  on  some  10  000 
complete  simulated  operations  of  the  algorithm  [18,  19] . 
Most  of  these  simulations  were  carried  out  on  artificial 
data  generated  by  a  library  of  special  functions  to  which 
a  noise  term  was  added.  These  special  functions  included 
some  highly  complex  nonmonotonic  curves.  Additionally, 
curves  based  on  the  empirical  drift  formula  for  thin  film 
resistors were studied [$R(t) = A t^{\alpha}$ where $\alpha$ lies between
0.3 and 0.5] [18, 19]. In both cases, random noise with
amplitudes of up to 25 percent of the tolerance interval
was added to the data. The result of these simulations,
which we believe to represent an environment which is
more  extreme  than  the  real  world,  was  that  99.5  percent 
of  all  SRA's  were  replaced  before  on-line  failure  at  a  cost 
of  about  10  percent  of  their  lifetime. 

At  the  present  time  a  somewhat  more  sophisticated 
fault  prediction  algorithm  is  under  development  [5] . 

This is still essentially a curve fitting algorithm though
one in which a failure model (founded in modern reliability
theory [1]) is employed. The basic idea for this algorithm
is  as  follows.  The  drifting  SRA  parameter  r  is  assumed  to 
satisfy  a  difference  equation 

$$r(k+1) = r(k) + f(k) \tag{8}$$

where  the  "component  time"  k  represents  the  number  of 
shocks  the  SRA  has  received  (e.g.,  switching  processes, 
electrons  boiling  off  a  cathode,  etc.).  The  relation  between 
component  time,  i.e.,  the  number  of  shocks  received,  and 
real  time  is  assumed  to  be  a  Poisson-distributed  random 
variable  in  which  the  probability  of  the  SRA  receiving  n 
shocks  in  a  time  interval  of  length  t  is 

$$P_n(t) = \frac{(\lambda t)^n e^{-\lambda t}}{n!} \tag{9}$$

It  is  assumed  that  the  value  of  the  parameter  r  is  known 
for a fixed set of points in real time: $r(t_1), r(t_2), \ldots, r(t_m)$.
Using this data we desire to estimate the unknown failure
dynamics $f(k)$ for the SRA parameter. This is then used
in (8) to compute the number of shocks required to cause
failure, i.e., the smallest value of $k$ for which $r(k)$ is out
of  tolerance.  Finally,  this  estimate  is  used  to  compute 
the  optimal  real  time  at  which  to  replace  the  SRA  to 
minimize  the  cost  functional 

$$J = c_f P_f + c_w W. \tag{10}$$

Here $P_f$ is the probability of on-line failure, $W$ is the average
percentage of SRA lifetime which is wasted by replacing
the SRA before its actual failure, and $c_f$ and $c_w$ are weighting
factors.

Note  that  the  implementation  of  the  Poisson-shock- 
based fault prediction algorithm just described requires
that we deal simultaneously with two unknown phenomena:
the failure dynamics $f(k)$, and the random relationship between
"real time" and "component time" given by the
Poisson  distribution.  Although  the  required  analysis  is 
complex,  a  surprisingly  tractable  (and  optimal  in  an  appro¬ 
priate  sense)  fault  prediction  algorithm  can  be  formulated. 
The  properties  of  the  Poisson  distribution  are  used  to  esti¬ 
mate  the  number  of  shocks  which  the  SRA  has  received  in 
the time intervals $[t_{i-1}, t_i]$, $i = 1, 2, \ldots, m$, in combination
with a generalized inverse algorithm to estimate $f(k)$. Here $f(k)$
is approximated by a jth-order polynomial and one must
compute the generalized inverse of an m by j matrix. Fortunately,
the algorithm is ideally suited to a sequential
least squares technique [8] and no matrix inversions need
be carried out on-line. Once $f(k)$ has been estimated to a
satisfactory level of accuracy (by increasing the order of
the  approximating  polynomial  until  the  estimation  error 
is  reduced  to  a  prescribed  level),  it  is  used  with  (8)  to  com¬ 
pute  the  number  of  shocks  required  for  the  parameter  to 
go  out  of  tolerance.  Finally,  this  value  is  used  in  conjunc¬ 
tion  with  the  Poisson  distribution  to  determine  the  opti¬ 
mal  real  time  at  which  to  replace  the  SRA.  Although 
apparently  complex,  this  latter  optimization  can  be  reduced 
by  analytic  techniques  to  the  solution  of  a  single  nonlinear 
equation in one variable [5]. As such, the entire fault
prediction  algorithm  may  be  easily  implemented,  on-line, 
in  a  BIT  system.  Unlike  the  second-order  curve  fitting 
algorithm, the Poisson shock algorithm for fault prediction
is still under development and its simulation on real
world  data  is  just  beginning. 
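The remark that a sequential least squares technique avoids on-line matrix inversions can be illustrated with the textbook recursive least squares update. The sketch below is a generic polynomial fit, not the algorithm of [5] or [8]: the regression set-up, the initial covariance value, and all names are assumptions for illustration. Each new sample costs a few vector operations and one scalar division.

```python
# Sketch only: textbook recursive (sequential) least squares for polynomial
# coefficients. How the paper maps shock counts to regressors is not
# specified here; this shows only the inversion-free update.

def rls_update(theta, P, phi, y):
    """Fold one sample (phi, y) into the estimate theta and matrix P."""
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))   # scalar
    gain = [v / denom for v in P_phi]
    err = y - sum(phi[i] * theta[i] for i in range(n))
    theta = [theta[i] + gain[i] * err for i in range(n)]
    # P stays symmetric, so P - gain * (P phi)^T is the usual update.
    P = [[P[i][j] - gain[i] * P_phi[j] for j in range(n)] for i in range(n)]
    return theta, P

def fit_polynomial(samples, order, p0=1e6):
    """Sequentially fit sum_p theta[p] * k**p to (k, y) samples."""
    n = order + 1
    theta = [0.0] * n
    P = [[p0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k, y in samples:
        phi = [float(k) ** p for p in range(n)]
        theta, P = rls_update(theta, P, phi, y)
    return theta

# Noise-free data from a known quadratic, used only to exercise the update.
samples = [(k, 2.0 + 3.0 * k + 0.5 * k * k) for k in range(20)]
coefficients = fit_polynomial(samples, order=2)
```

Raising `order` and refitting until the residual falls below a prescribed level mirrors the order-selection step described in the text.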

From  the  point  of  view  of  our  BIT  architecture  where 
the  central  microprocessor  is  dedicated  exclusively  to  the 
fault  prediction  job  (plus  bookkeeping  and  control  of  the 
test  communications  link),  we  anticipate  little  difficulty  in 
implementing  either  of  our  fault  prediction  algorithms. 

The  key  to  the  viability  of  the  concept,  however,  is  to  make 
the algorithm fast enough so that a single central microprocessor
can be time-multiplexed to test a large number
of  parameters.  In  an  effort  to  verify  the  feasibility  of  such 


an  approach,  we  are  presently  in  the  process  of  implement¬ 
ing  the  second-order  curve  fitting  algorithm  on  an  F8  micro¬ 
processor  [15] .  Although  the  implementation  has  yet  to 
be  completed,  most  of  the  subprograms  have  been  written 
and  tested  and  it  appears  that  the  program  will  require 
about  500  bytes  of  memory  and  execute  in  about  30  ms. 

As  such,  the  central  microprocessor  would  be  able  to  cycle 
through the testing of about 2000 parameters at 1-min
intervals. 
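The figure of about 2000 parameters follows from simple arithmetic, assuming (as an idealization not stated in the text) that the processor spends the whole cycle on prediction and that bus and bookkeeping overhead are negligible:

```latex
N \approx \frac{60\ \text{s per cycle}}{30\ \text{ms per parameter}}
  = \frac{60}{0.030} = 2000\ \text{parameters per 1-min cycle}
```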


Our  purpose  in  the  preceding  has  been  to  outline  an 
approach  for  designing  a  BIT  system  applicable  to  elec¬ 
tronic  circuits  and  systems.  Although  not  yet  implemented 
in  hardware,  each  of  the  constituent  parts  of  the  BIT  sys¬ 
tem  has  been  extensively  simulated  and  we  believe  that  a 
hardware  implementation  is  feasible  both  computationally 
and economically. At the present time, we are implementing
the polynomial curve fitting algorithm for fault prediction
in hardware and are in the preliminary stages of
implementing  the  entire  system  in  a  high-voltage  power 
supply. 


V.  Self-Testing 

An  interesting  side  effect  of  running  a  BIT  system  in  a 
predictive  mode  is  that  it  opens  the  possibility  of  reliable 
self-testing.  The  key  observation  is  that  to  do  fault  pre¬ 
diction  in  a  digital  device  one  must  test  analog  parameters 
such  as  rise  time,  power  supply  voltage,  clock  rate,  pulse- 
widths,  etc.,  since  digital  parameters  are  either  right  or 
wrong  and  have  no  gray  region  from  which  to  extrapolate 
trends.  One  may  therefore  use  a  microprocessor  to  pre¬ 
dict  its  own  failure  by  extrapolating  the  values  of  its 
analog  parameters.  As  long  as  the  prediction  is  made 
before  these  parameters  go  out  of  tolerance,  the  digital 
performance  of  the  microprocessor  is  still  exact.  The 
point  is  that  in  a  predictive  mode  the  microprocessor  is 
still  working  at  the  time  it  predicts  its  own  failure  and 
hence  may  be  used  reliably  in  a  self-testing  mode.  Of 
course,  once  the  analog  parameters  of  the  microprocessor 
have  exceeded  their  tolerance  limits,  it  may  no  longer  be 
trusted  as  a  digital  signal  processor  and  hence  the  device 
cannot  be  used  to  diagnose  its  own  faults  after  failure. 

Although  the  self-testing  concept  just  described  is 
purely  conceptual  and  has  yet  to  be  implemented  or  even 
simulated,  it  is  indicative  of  the  potential  of  fault  predic¬ 
tion  in  a  BIT  system.  Indeed,  if  one  can  reliably  predict 
failure  before  it  actually  takes  place,  such  concepts  as 
self-repair  move  into  the  realm  of  feasibility,  since  at  the 
time  a  replacement  decision  is  made  the  device  under  test 
is  still  working. 


VI. Conclusions
Abstract


A fault analysis algorithm appropriate for time varying and nonlinear systems
is developed. The algorithm essentially constructs an observer for a nonlinear
system that is intimately related to the system under test.


INTRODUCTION 

Given enough time and computing capability, brute
force searches will identify possible fault sets.
The real problem in fault analysis is to construct
algorithms that, in some sense, locate the
fault sets "efficiently". "Efficiently", in this
context, means that the fault isolation must be
done relatively quickly and with limited on-site
computing. Such techniques have been developed
for linear time invariant
systems. These, however, make heavy use of
the defining properties for these systems, and
don't generalize. The purpose of this paper is
to show that an observer for an appropriate nonlinear
differential equation can be utilized, on
line, to determine the values of the system parameters.
A technique, based on optimal control
theory, for constructing such observers is also
presented.

OBSERVERS AND FAULT ANALYSIS

Consider testing a system that is described by
the nonlinear state equations

$$\dot{x}_1 = f(x_1, a, u, t)$$

$$y = g(x_1)$$

where $x_1$ is the dynamical state vector, $a$ is the
vector of parameters to be estimated (they are
assumed constant over the test time), and $u$ is the
input used in the test procedure. If we want to
estimate $a$ we need to include it in the state vector,
i.e., we want to build an observer for the
augmented differential equation

$$\frac{d}{dt}\begin{bmatrix} x_1 \\ a \end{bmatrix} = \begin{bmatrix} f(x_1, a, u, t) \\ 0 \end{bmatrix}$$

$$y = g(x_1).$$


If it is possible to build an observer that
will observe the subvector $a$ we have solved the
fault analysis problem. It would then be necessary
to justify our solution in terms of time and
computation requirements.


AN OBSERVER DESIGN

We chose to design an observer with the following
structure

$$\frac{d}{dt}\begin{bmatrix} \hat{x}_1 \\ \hat{a} \end{bmatrix} = \begin{bmatrix} f(\hat{x}_1, \hat{a}, u, t) \\ 0 \end{bmatrix} + H (y - \hat{y}) \qquad (H \text{ time invariant})$$

$$\hat{y} = g(\hat{x}_1).$$

We term such an observer a model reference
linear time invariant observer. The term "linear
time invariant" is used because the residuals
enter in a linear time invariant fashion. The
problem  is  now  that  of  choosing  H.  To  avoid  in¬ 
volved  stability  considerations  (at  least  initial¬ 
ly)  we  choose  H  so  that  it  minimizes  the  follow¬ 
ing  function 

$$J(H) = \int_{t_0}^{t_1} \left[ (x_1 - \hat{x}_1)^2 + (a - \hat{a})^2 \right] dt$$

and  hope  that  the  stability  takes  care  of  itself. 
The  construction  of  H  can  now  be  done  by  solving 
the following optimal control problem:

$$\min_{H} J(H), \qquad J(H) = \int_{t_0}^{t_1} X^T Q X \, dt, \qquad Q = \begin{bmatrix} I & -I \\ -I & I \end{bmatrix}$$

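subject to the differential equation constraints

$$\dot{X} = \frac{d}{dt}\begin{bmatrix} x_1 \\ a \\ \hat{x}_1 \\ \hat{a} \end{bmatrix} = \begin{bmatrix} f(x_1, a, u, t) \\ 0 \\ f(\hat{x}_1, \hat{a}, u, t) \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ H \end{bmatrix} (y - \hat{y})$$

$$X(t_0) = \left[ x_1(t_0)^T, a(t_0)^T, \hat{x}_1(t_0)^T, \hat{a}(t_0)^T \right]^T$$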


Note. 1) $H$ will be dependent on the $X(t_0)$ used in
its construction, so when it is used to estimate
$a$ (when $a$ differs from $a(t_0)$) it has little chance
of being the optimal $H$. So even though we use
optimization techniques to construct $H$, it will
not, in general, be optimal. 2) Several observers
may need to be constructed, each one convergent
for $a$ in a different region.
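The augmented-state observer structure can be tried out on a toy problem. The sketch below assumes a hypothetical scalar system with one unknown parameter and forward-Euler integration. The constant gain H is picked by hand so that the linearized error dynamics are stable; it is not obtained from the optimal control problem described above. All names and numbers are illustrative.

```python
# Sketch only: model reference observer with a constant gain H for the
# scalar system  dx/dt = -a x + u,  y = x,  with unknown constant a.
# The gain is hand-picked, NOT the minimizer of J(H); the system, step
# size, and names are illustrative assumptions.

def estimate_parameter(a_true=2.0, a_guess=1.5, u=1.0,
                       H=(4.0, -8.0), dt=1e-3, t_end=20.0):
    """Run plant and observer side by side; return the final estimate of a."""
    x = 1.0                       # true state (measured through y = x)
    x_hat, a_hat = x, a_guess     # observer state and parameter estimate
    for _ in range(int(round(t_end / dt))):
        residual = x - x_hat      # y - y_hat, enters linearly through H
        dx = -a_true * x + u
        dx_hat = -a_hat * x_hat + u + H[0] * residual
        da_hat = 0.0 + H[1] * residual        # da/dt = 0 plus correction
        x += dt * dx
        x_hat += dt * dx_hat
        a_hat += dt * da_hat
    return a_hat

a_estimate = estimate_parameter()
```

With a different initial guess or operating region the same gain may converge slowly or not at all, which is the reason noted above for constructing several observers.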

Experience indicates that only a few components
fail at a time. Because of this a reasonable
approach is to construct an observer for each component
(thereby minimizing the dimension of the
augmented state vector) and estimate the parameters
for each component in parallel. Observers
can also be built for the common two and three
element faults.
Abstract

This paper generalizes the classical Householder's
formula to certain nonlinear operators.
This class of nonlinear operators is shown to
be common in circuit theory. Several examples
are provided that show where these operators
occur and the result is applied.


1.  Introduction 

The purpose of this paper is to present
a technique for analyzing lumped analog systems
with some linear and some nonlinear elements.
It is shown that such a system is described
by an operator of the form
(1.1) $B + Y \circ D$
where $B$ and $D$ are linear and $Y$ is nonlinear.
Since no assumptions are made about the nature
of the nonlinearities, it is impossible to view
the operator $Y \circ D$ as small in any sense; hence,
$Y \circ D$ has to be viewed as a large nonlinear perturbation
of the linear operator $B$.

The technique to be presented is based on
a theorem that allows us to invert (1.1) in
two steps. First, invert the linear operator
B. If there are N_L linear elements and N_N
nonlinear elements, B will be an
(N_L+N_N) x (N_L+N_N) matrix. Second, invert a
nonlinear operator of rank N_N. That such a
result exists is not surprising. Those experienced
in solving equations involving such operators apply
Gaussian elimination until there are N_N nonlinear
equations in N_N unknowns. Another way to see
that this segregation can be accomplished is to
view the nonlinear elements as a "load" on an
appropriate linear circuit, in much the same
way as a circuit with one nonlinear element is
analyzed by viewing that element as the load
and finding the Thevenin equivalent circuit
that it sees.

The main result of this paper is obtained
by generalizing to operators of the form B + Y \circ D
a classical theorem concerning linear operators
known as Householder's Formula. This classical
result and its generalization are stated and
proven in Section 2. In Section 3, we show that
such operators do, indeed, occur in circuit
theory, and then two examples are presented. The
results are summarized in Section 4.
Section 2

The classical Householder's formula [1] provides
a means of calculating the inverse of the
matrix B+CD in terms of B^{-1} and (I + D B^{-1} C)^{-1}.
If B^{-1} is known and if the dimensions of C and D are
appropriate, then a great savings in time and
effort can be realized using this technique.

Theorem 1: (Classical Householder's Formula)

If B is an NxN matrix, C is an NxP
matrix, and D is a PxN matrix, then

    (B + CD)^{-1} = B^{-1} - B^{-1} C (I + D B^{-1} C)^{-1} D B^{-1}.
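
Theorem 1 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `householder_inverse` and the 3x3 data are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def householder_inverse(B_inv, C, D):
    """Return (B + C D)^{-1} from a known B^{-1} using Theorem 1.

    Only a P x P matrix, I + D B^{-1} C, is inverted here.
    """
    P = C.shape[1]
    small = np.linalg.inv(np.eye(P) + D @ B_inv @ C)  # (I + D B^{-1} C)^{-1}
    return B_inv - B_inv @ C @ small @ D @ B_inv

# Illustrative data (assumed, not from the paper): N = 3, P = 1.
B = np.diag([2.0, 4.0, 5.0])
C = np.array([[1.0], [2.0], [0.0]])
D = np.array([[1.0, 0.0, 1.0]])

via_formula = householder_inverse(np.linalg.inv(B), C, D)
direct = np.linalg.inv(B + C @ D)
print(np.allclose(via_formula, direct))
```

The saving the text refers to appears when P is much smaller than N and B^{-1} is already available: the only new inverse is P x P.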

In the nonlinear extension, the linear
operator C is replaced by the nonlinear operator Y.
The proof of this extension looks, at first
glance, like the proof of a linear rather than
a nonlinear theorem. To see that this is indeed
a nonlinear result, the differences between
nonlinear and linear operator algebra will be
reviewed by giving two basic definitions.

Definition 1: (Operator Addition) Let f and g be
two operators (linear or nonlinear)
with the same domain; then the
operator f+g is defined by the
following

(2.1)  (f+g)(x) = f(x) + g(x).

Definition 2: (Linearity) An operator f is linear
if for all x and y in its domain
and all scalars \alpha and \beta

(2.2)  f(\alpha x + \beta y) = f(\alpha x) + f(\beta y)
                             = \alpha f(x) + \beta f(y).

The argument distributes to the left for all
operators, but the operator distributes to the
right only for linear operators (that is,
(f+g)h = fh + gh always holds, while f(g+h) = fg + fh
requires f to be linear). With this
distinction in mind, we are ready to state and
prove our main result, which is a closed form
expression for (B+YD)^{-1} in terms of B^{-1} and
(I + D B^{-1} Y)^{-1} (of course, operator multiplication
is to be interpreted as composition, i.e., YD = Y \circ D).

Theorem 2: If i) B and D are linear operators,
ii) B^{-1} exists, iii) Y is an arbitrary
operator, and iv) B+YD is defined, then

(2.3)  (B + YD)^{-1} = B^{-1} - B^{-1} Y (I + D B^{-1} Y)^{-1} D B^{-1}.
Proof: Consider the operator X+XYX where X is
linear and Y is possibly nonlinear.

    X + XYX = X(I + YX) = (I + XY)X.

If (I+YX) and (I+XY) are both invertible
(it can be shown [2] that one is invertible
if and only if the other is) we have

(2.4)  (I + XY)^{-1} X = X (I + YX)^{-1}.

Now consider the identity

    I = (I + YX)(I + YX)^{-1} = I(I + YX)^{-1} + YX(I + YX)^{-1}
      = (I + YX)^{-1} + Y(I + XY)^{-1} X

where we have used (2.4). Solving for
(I+YX)^{-1} yields

(2.5)  (I + YX)^{-1} = I - Y(I + XY)^{-1} X.

Finally, consider the operator B+YD.

    (B + YD)^{-1} = [(I + Y D B^{-1}) B]^{-1} = B^{-1} (I + Y D B^{-1})^{-1}.

Letting X = D B^{-1} in (2.5) yields

    (B + YD)^{-1} = B^{-1} [I - Y(I + D B^{-1} Y)^{-1} D B^{-1}]
                  = B^{-1} - B^{-1} Y (I + D B^{-1} Y)^{-1} D B^{-1}.


To see how this result is useful, consider
the case where B+YD is an Nth order nonlinear
operator, D a linear operator that maps
R^N -> R^P, P<N, and Y a nonlinear operator that maps
R^P -> R^N. This result allows the solution of N
nonlinear equations in N unknowns to be replaced
by the solution of N linear equations in N
unknowns and also the solution of P (recall P<N)
nonlinear equations in P unknowns. Thus, we have,
via a closed form expression, ordered our
equations and unknowns properly to make maximum use
of linear techniques and minimum use of nonlinear
techniques.
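
The two-step use of Theorem 2 described above can be sketched as follows. This is a minimal illustration, assuming NumPy; the names `solve_b_plus_yd` and `newton_fd`, the finite-difference Newton iteration, the starting guess, and the 2x2 data are all assumptions made for the sketch. The small problem solved is (I + D B^{-1} Y)(u) = D B^{-1} z, after which x = B^{-1} z - B^{-1} Y(u).

```python
import numpy as np

def newton_fd(F, rhs, u0, iters=50, eps=1e-7):
    """Solve F(u) = rhs with Newton's method and a finite-difference Jacobian."""
    u = np.array(u0, dtype=float)
    for _ in range(iters):
        r = F(u) - rhs
        if np.linalg.norm(r) < 1e-12:
            break
        J = np.empty((u.size, u.size))
        for j in range(u.size):
            du = np.zeros_like(u)
            du[j] = eps
            J[:, j] = (F(u + du) - F(u)) / eps
        u = u - np.linalg.solve(J, r)
    return u

def solve_b_plus_yd(B, Y, D, z, u0):
    """Solve (B + Y o D)(x) = z: one N x N linear solve plus P nonlinear equations."""
    B_inv = np.linalg.inv(B)
    x_lin = B_inv @ z                                  # linear part, B^{-1} z
    small = lambda u: u + D @ B_inv @ Y(u)             # (I + D B^{-1} Y)(u)
    u = newton_fd(small, D @ x_lin, u0)                # P equations in P unknowns
    return x_lin - B_inv @ Y(u)

# Illustrative data (assumed): N = 2, P = 1, cubic nonlinearity in one variable.
B = np.array([[0.5, -0.5], [0.5, 0.5]])
D = np.array([[1.0, 0.0]])
Y = lambda u: np.array([u[0] ** 3 - u[0], 0.0])
z = np.array([24.0, 3.0])

x = solve_b_plus_yd(B, Y, D, z, u0=[2.0])
print(x)
```

The residual B x + Y(D x) - z can be evaluated afterwards as an independent check on the computed x.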

It should be noted that the proof of Theorem
2 relied on the fact that B and D were linear
operators and allowed Y to be arbitrary. B
and D were not assumed to be matrices and Y was
not assumed to map R^P -> R^N. Any or all of the
operators could be differential operators and the
result would still be valid. Regardless of
whether the operators are differential or
functional, we have succeeded in breaking the problem up
into a linear portion and a nonlinear portion.

If there are few nonlinear components in
comparison with the number of linear components, the
nonlinearities can be viewed as a perturbation on
the linear system.

Section  3 

The purpose of this section is to show that
operators of the form B+YD occur in circuit
analysis problems and to apply Theorem 2 to two
examples.

This type of operator arises naturally in
nonlinear network analysis. Consider the node
analysis of a network with reduced incidence
matrix A [3]. Kirchhoff's Laws are

(KCL)  A i = 0,

(KVL)  v = A^T e.

The branch equations might be

    i = G v + i_j - G v_j + f(v)

where

i = the branch current vector;
v = the branch voltage vector;
i_j = the current source vector;
v_j = the voltage source vector;
e = the node-to-datum voltage vector;

G is assumed to be an "invertible" matrix of
differential operators, f is a nonlinear
differential operator, and all branches are voltage
controlled. If

    i \triangleq A G v_j - A i_j

then Kirchhoff's Laws and the branch equations
can be combined to yield

(3.1)  (A G A^T) e + A f(A^T e) = i.

Letting A G A^T = B and A f(\cdot) = Y, we see that this
operator is of the desired form. The typical
situation is for f to be a function of only a few
(p<n) linear combinations of the components of v;
then (3.1) can be rewritten in the form

(3.2)  [B + Y(C A^T)] e = i

which is precisely the type of operator that is
amenable to the results of Theorem 2.

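We now apply Theorem 2 to the problem of
solving two nonlinear simultaneous equations in
two unknowns. Consider the following nonlinear
equations

    f(X) = \begin{bmatrix} \tfrac{1}{2} X_1 - \tfrac{1}{2} X_2 + X_1^3 - X_1 \\ \tfrac{1}{2} X_1 + \tfrac{1}{2} X_2 \end{bmatrix}
         = \begin{bmatrix} 24 \\ 3 \end{bmatrix} = Z.

In order to apply Theorem 2, f(X) must be put into
the form

    (B+YD)(X)

where B is an invertible matrix. One way to do
this is

    f(X) = (B+YD)X = \begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix} X + Y([1,0]X)

where

    Y(u) = \begin{bmatrix} u^3 - u \\ 0 \end{bmatrix}.

Theorem 2 says that

    X = B^{-1} Z - B^{-1} Y (I + D B^{-1} Y)^{-1} D B^{-1} Z = X_L - X_{NL}

where

    X_L = B^{-1} Z = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 24 \\ 3 \end{bmatrix} = \begin{bmatrix} 27 \\ -21 \end{bmatrix}

and

    X_{NL} = B^{-1} Y (I + D B^{-1} Y)^{-1} D B^{-1} Z
           = B^{-1} Y (I + D B^{-1} Y)^{-1} D X_L
           = B^{-1} Y (I + D B^{-1} Y)^{-1} 27.

Now

    (I + D B^{-1} Y)^{-1} 27 = u

is equivalent to

    (I + D B^{-1} Y) u = 27.

    (I + D B^{-1} Y) u = u + [1,0] \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} u^3 - u \\ 0 \end{bmatrix}
                       = u + [1 \; 1] \begin{bmatrix} u^3 - u \\ 0 \end{bmatrix} = u + u^3 - u = u^3 = 27

which implies

    u = 3.

Now

    X_{NL} = B^{-1} Y(3) = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} (3)^3 - (3) \\ 0 \end{bmatrix} = \begin{bmatrix} 24 \\ -24 \end{bmatrix}.

So

    X = X_L - X_{NL} = \begin{bmatrix} 27 - 24 \\ -21 + 24 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix}.

The reason for choosing a functional example
is that for large circuits or systems, the
differential equations are solved numerically, so at each
iteration an operator of the form B+YD must be
inverted. To see that this is indeed the case,
consider discretizing the differential equations
obtained from the component connection model [2]
of a system. The component equations are assumed
(here) to be given in state form

    \dot{x} = f(x, a)

    b = g(x, a)

where a is the vector of component inputs,
b is the vector of component outputs, and x is the
state vector of the components. The connection
equations (KVL and KCL equations) are given by

    \begin{bmatrix} a \\ y \end{bmatrix} = \begin{bmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} b \\ u \end{bmatrix}

where u is the vector of system inputs and y is
the vector of system outputs. If we order the
entries in all of the vectors correctly, we can
partition the vectors in the following manner

    a = \begin{bmatrix} a^N \\ a^L \end{bmatrix},  b = \begin{bmatrix} b^N \\ b^L \end{bmatrix},  and  x = \begin{bmatrix} x^N \\ x^L \end{bmatrix}

where the superscript N (L) denotes entries
associated with the nonlinear (linear) components.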
The discretized equations that the computer is to
solve have the form

    \sum_{i=0}^{r} d_i x^N_{k-i} = f^N(x^N_k, a^N_k),

    \sum_{i=0}^{r} d_i x^L_{k-i} = A x^L_k + B a^L_k,

    b^N_k = g^N(x^N_k, a^N_k),

    b^L_k = C x^L_k + D a^L_k,

    a^N_k = L_{11}^{NN} b^N_k + L_{11}^{NL} b^L_k + L_{12}^{N} u_k,

    a^L_k = L_{11}^{LN} b^N_k + L_{11}^{LL} b^L_k + L_{12}^{L} u_k,

    y_k = L_{21} b_k + L_{22} u_k.
The last equation is just the output equation and
is not used during the iterations. These equations
can be put in the following form

    B W_k + Y(D W_k)

[The block-matrix definitions of B, W_k, Y, and the right-hand side are illegible in the source scan.]

where D = [I, 0], and I is conformable with
4. Conclusion

The classical Householder's Formula has been
generalized to certain nonlinear operators. It
was shown that these nonlinear operators occur
in circuit theory, both in the differential
equations that describe the circuit and in the
discretized equations that are used in the
computer-aided analysis of these circuits. It
is hoped that this result will be as useful a
tool in the fault analysis of nonlinear circuits
as the classical result turned out to be in the
fault analysis of linear circuits.


14.  Reprint  of  "CAD  Oriented  Measures  of  Testability"  by  R.  Saeks  from  the 
Proceedings  of  the  Industry/Joint  Services  Test  Conference 
and  Workshop,  NSIA,  San  Diego,  April  1978,  pp.  71-72. 


ABSTRACT 

Measures of testability for both analog and digital systems which can be
incorporated into a computer-aided design package are surveyed. The
application of these measures for evaluating and improving system
diagnosability is discussed.

Although maintenance related questions have historically been given low
priority in system design, with the advent of integrated circuit technology
during the past decade the cost of maintenance has become a dominating
factor in determining the system life-cycle costs. As such, considerable
interest in the design of systems which are readily testable has developed,
along with a concomitant interest in the development of a quantitative
measure of testability. The latter may be used to aid in the design of
readily testable systems. Moreover, such a measure of testability can be
employed as a means of specifying system testability for acquisition
purposes.

Two classes of testability measures have been proposed, both of which have
applicability. The first is a coarse measure of testability which can be
computed by hand from an inspection of the system. This might include
numbers of input and output test points, number and complexity of system
components, memory complexity, etc. Alternatively, one might choose to
adopt a more sophisticated measure of testability using computer aid for
its evaluation. Indeed, with the growing prevalence of CAD packages in the
design houses, a subroutine for computing a measure of testability during
the design process could be incorporated into an existing CAD package with
little difficulty.

Although at the time of this writing no clear criterion for defining a
measure of testability has yet emerged, several approaches are presently
under investigation. One such approach is based on the concepts of
controllability and observability, intuitively deeming a system to be "more
testable" if it is "more controllable and observable". Since
controllability and observability are measures of one's ability to exercise
the internal system elements via external inputs and observations, such a
viewpoint is quite reasonable. Unfortunately, as classically defined,
controllability and observability only measure one's ability to exercise
the system memory elements, and hence some type of extension of the concept
is required if the resultant measure of testability is to include
combinational information as well.

An alternative approach, primarily intended as a measure of testability for
analog circuits, has recently been proposed by Sen and the author (refs. 3,
4, 5). Here, one uses the implicit function theorem to estimate the
dimension of the manifold of arbitrary parameters resulting from the
solution of a set of "fault diagnosis equations", with a lower dimensional
ambiguity set indicating a "more testable" system. Unfortunately, this
approach to the formulation of a measure of testability is heavily
predicated on the differentiable properties of analog systems, with
considerable work still required for its extension to the digital
of sparse matrices, computation of the roots of a family of polynomials or
nonlinear equations, solution of the eigenvalue problem for a family of sparse
matrices, etc. The goal of the present work unit is to develop a class of
continuation algorithms for the solution of such problems in which one
formulates a nonlinear differential equation whose trajectories represent the
solutions to the given family of problems as a function of the underlying
parameter. One then computes the solution to the given problem at one
parameter value by classical techniques and numerically integrates the
differential equation using this value as an initial condition to obtain
solutions to the entire family of problems. We believe that these techniques
are far more efficient than the classical technique of discretizing the
parameter value and applying standard numerical techniques at each point.
Indeed, this has been borne out by our preliminary experience in applying
continuation algorithms to large scale systems problems.
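
The continuation idea described above can be illustrated on the simplest family, the inverse of a parameter-dependent matrix. The sketch below is a generic illustration and not the algorithm developed in this work unit: it assumes NumPy, uses the standard identity d(A^{-1})/d\lambda = -A^{-1} (dA/d\lambda) A^{-1}, and integrates it with a classical fourth-order Runge-Kutta step. The function name `track_inverse` and the 2x2 family A(\lambda) = A0 + \lambda A1 are hypothetical.

```python
import numpy as np

def track_inverse(A_of, dA_of, lam0, lam1, steps=200):
    """Follow X(lam) = A(lam)^{-1} from lam0 to lam1 by integrating
    dX/dlam = -X (dA/dlam) X with classical RK4."""
    X = np.linalg.inv(A_of(lam0))          # one classical solve at the start
    h = (lam1 - lam0) / steps
    rhs = lambda lam, X: -X @ dA_of(lam) @ X
    lam = lam0
    for _ in range(steps):
        k1 = rhs(lam, X)
        k2 = rhs(lam + h / 2, X + h / 2 * k1)
        k3 = rhs(lam + h / 2, X + h / 2 * k2)
        k4 = rhs(lam + h, X + h * k3)
        X = X + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        lam += h
    return X

# Hypothetical family A(lam) = A0 + lam * A1, nonsingular for 0 <= lam <= 1.
A0 = np.array([[4.0, 1.0], [2.0, 3.0]])
A1 = np.array([[1.0, 0.0], [0.0, 2.0]])
A_of = lambda lam: A0 + lam * A1
dA_of = lambda lam: A1

X_end = track_inverse(A_of, dA_of, 0.0, 1.0)
print(np.allclose(X_end, np.linalg.inv(A_of(1.0)), atol=1e-6))
```

Only one matrix inversion is performed; every later value of the inverse comes from matrix products along the trajectory.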

* NSF grant for related work applied to the computer-aided design problem.
The research may naturally be subdivided into two areas: the formulation
of analysis and design techniques for large scale systems and the development
of numerical methods for their solution. The former area includes system
simulation algorithms, large change sensitivity analysis algorithms, a
multivariate Nyquist theory and several root locus algorithms. In the numerical
area we have developed continuation algorithms for inverting sparse matrices
and for the solution of the eigenvalue problem in a family of sparse matrices.
Additionally, we have formulated several root locus algorithms and a method for
tracking the solutions of a parameterized family of nonlinear equations.

The major result obtained during the year has been the formulation of
several continuation algorithms for the solution of the eigenvalue problem
in a family of sparse matrices. Although a continuation algorithm for the
solution of the eigenvalue problem has been known for a number of years, the
existing algorithm uses the eigenvectors as auxiliary variables. As such,
since the matrix of eigenvectors for a sparse matrix is non-sparse, this
algorithm fails to preserve the sparseness of the given matrix. We have
therefore developed three alternative continuation algorithms in which the
auxiliary variables take the form of appropriate similarity transformations
for the given family of matrices, which are assured to be sparse if the given
family of matrices is sparse.

In other areas we have developed a continuation algorithm which is
employed to find multiple roots of a polynomial with applications to
root locus problems. The latter application is also represented by
reprints of two papers in which various root locus algorithms are
formulated. Additionally, we have included a reprint of a paper from the IEEE
Proceedings in which a continuation algorithm for sparse matrix inversion is
developed and reprints of two conference papers on the solution of parameterized
families of nonlinear equations.

7.  Publications  and  Activities: 

A.  Refereed  Journal  Articles 

1.  Pan, C.T., and K.S. Chao, "A Computer-Aided Root-Locus Method",
IEEE Trans. on Automatic Control, Vol. AC-23, pp. 856-860, (1978).

2.  DeCarlo, R.A., and R. Saeks, "A Root Locus Technique for
Interconnected Systems", IEEE Trans. on Systems, Man, and Cybernetics,
Vol. SMC-9, pp. 53-55, (1979).

3.  Saeks, R., "A Continuation Algorithm for Sparse Matrix Inversion",
IEEE Proc., Vol. 67, pp. 682-683, (1979).

4.  Pan, C.T., and R. Saeks, "Multiple Solutions of Nonlinear Equations:
Roots of Polynomials", IEEE Trans. on Circuits and Systems (to
appear).

B.  Conference  Papers 

1.  Pan, C.T., and K.S. Chao, "Multiple Solutions of a Class of Nonlinear
Equations", Proc. of the 1979 IEEE Inter. Symp. on Circuits and
Systems, Tokyo, July 1979, pp. 577-580.

2.  Pan, C.T., and K.S. Chao, "A Continuation Method for Finding the
Roots of a Polynomial", Proc. of the 22nd Midwest Symp. on Circuits
and Systems, Univ. of Pennsylvania, Philadelphia, June 1979, pp.
428-431.

C.  Preprints 

1.  Green,  B.,  "Continuation  Algorithms  for  the  Solution  of  the  Eigenvalue 
Problem",  (preliminary  draft) . 

D.  Theses 

1.  Green,  B.,  "Continuation  Algorithms  for  the  Solution  of  the  Eigenvalue 
Problem",  M.S.  Thesis,  Texas  Tech  Univ.,  1979. 

E.  Conferences  and  Symposia 

1.  Chao, K.S., 21st Midwest Symposium on Circuits and Systems, Iowa
State Univ., Aug. 1978.

2.  Chao, K.S., 22nd Midwest Symposium on Circuits and Systems, Univ.
of Pennsylvania, June 1979.

3.  Chao, K.S., 1979 IEEE Inter. Symp. on Circuits and Systems, Tokyo,
July 1979.

4.  Green, B., Texas Systems Workshop, Dallas, March 1979.

5.  Chao, K.S., Texas Systems Workshop, Dallas, March 1979.

6.  Chao, K.S., Inter. Colloq. on Circuits and Systems, Taipei,
July 1979.
8.  Reprint of "A Computer-Aided Root-Locus Method", by C.T. Pan and K.S. Chao
from the IEEE Transactions on Automatic Control, Vol. AC-23,
pp. 856-860, (1978).

A Computer-Aided Root-Locus Method
C. T. PAN AND K. S. CHAO, MEMBER, IEEE

Abstract: An efficient computer-aided root-locus method is described.
The approach is based on the concept of continuation methods in which the
solution of a parameterized family of algebraic problems is converted into
the solution of a differential equation. The root-locus plot is obtained in a
systematic manner by numerical integration. Singularities are analyzed
and classified according to the properties of higher order derivatives.
Depending on their classification, singular points on the root loci are taken
care of accordingly.

I.  Introduction 

The root-locus method is one of the important design techniques for
linear time-invariant feedback systems. In addition to yielding frequency
response information of the system, it also provides a powerful tool for
solving problems in the time domain. The basic idea of the root-locus
method is to determine the closed-loop pole configuration as a function
of the gain from the configuration of the open-loop poles and zeros. A
great deal of information is available in texts and literature on the
method for the construction of root loci. The graphical method using
certain elementary geometric properties of the locus is probably the most
commonly used approach (see, e.g., [1]-[3]). Other approaches [4]-[7]
employ either analytic or semi-analytic representations that involve the
use of equations of the loci. Although analytic approaches enable one to
obtain accurate plots along with certain qualitative features of the root
paths, the point-to-point plotting is a formidable task. Besides,
investigations for higher order systems are virtually impractical.

It is the purpose of this paper to develop a computer-aided method for
plotting root loci in a systematic manner. The approach to be presented
in Section II is based on the concept of continuation methods [8]-[10].
The basic idea is to convert the solution of a parameterized family of
algebraic problems into the solution of a set of associated differential
equations. Section III is concerned with the existence and classification
of singular points on the root loci. In Section IV the results obtained are
illustrated by means of examples.

II. The Root-Locus Method

Consider the closed-loop system shown in Fig. 1. Let the open-loop
transfer function be expressed by

    G(s)H(s) = K \frac{A(s)}{B(s)}                                        (1)

where K is the open-loop gain, A(s) and B(s) are polynomials of degree
m and n, respectively, and m < n. The closed-loop transfer
function is

    T(s) = \frac{G(s)}{1 + G(s)H(s)} = \frac{G(s)B(s)}{B(s) + K A(s)}.    (2)

    Fig. 1. A closed-loop feedback system.

The root-locus plot of the closed-loop transfer function T(s) is defined as
the locus of the poles of T(s) when K varies from zero to infinity. This
plot consists of a set, denoted by l, of points in the s-plane such that

    g(s,K) = B(s) + K A(s) = 0,                                           (3)

i.e.,

    l = \{ s \mid g(s,K) = 0 \}.                                          (4)

Instead of solving the roots of (3) directly for each K, a system of two
simultaneous differential equations

    \frac{d}{dt} g(s(t),K(t)) = -g(s(t),K(t)),   g(s(0),K(0)) = 0
                                                                          (5)
    \frac{d}{dt} K(t) = \pm 1,                   K(0) = K_0

is considered, where s(0) = s_0 is a root of (3) corresponding to an initial
gain K_0, and t is a dummy variable. Application of the chain rule to (5)
results in

    \frac{ds}{dt} = -\left( g \pm \frac{\partial g}{\partial K} \right) \Big/ \frac{\partial g}{\partial s},   s(0) = s_0
                                                                          (6)
    \frac{dK}{dt} = \pm 1,                                                  K(0) = K_0

or equivalently,

    \frac{ds}{dt} = -\frac{(B(s) + K A(s)) \pm A(s)}{B'(s) + K A'(s)},   s(0) = s_0
                                                                          (7)
    \frac{dK}{dt} = \pm 1,                                                K(0) = K_0.

Equation (7) can now be solved by any numerical integration technique.
For example, using Euler's method, (7) reduces to

    s_{k+1} = s_k - h \frac{B(s_k) + K_k A(s_k) \pm A(s_k)}{B'(s_k) + K_k A'(s_k)}
                                                                          (8)
    K_{k+1} = K_k \pm h.

It is seen from the solution of (5)

    g(s,K) = g(s(0),K(0)) e^{-t} = 0 \cdot e^{-t} = 0

that for any admissible pair K_0 and s_0 satisfying (3), the corresponding
trajectory resulting from (7) will remain on the solution curve g(s,K) = 0
as K changes. The + or - sign is chosen depending on whether one
would like to increase or decrease K. Since the computed trajectory may
not satisfy (3) exactly, the minus sign in front of g in (5) is used to ensure
that the computed trajectory does not diverge away from the locus.
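
The Euler recursion (8) can be sketched in a few lines. The following is a minimal illustration in plain Python and not the authors' program; the helper names (`polyval`, `polyder`, `trace_branch`), the step size, and the second-order example B(s) = s^2 + 2s, A(s) = 1 are assumptions chosen so that the exact branch s = -1 + sqrt(1 - K) is known for comparison.

```python
def polyval(c, s):
    """Evaluate a polynomial with coefficients c (highest power first) at s."""
    r = 0
    for a in c:
        r = r * s + a
    return r

def polyder(c):
    """Coefficients of the derivative of the polynomial c."""
    n = len(c) - 1
    return [a * (n - i) for i, a in enumerate(c[:-1])]

def trace_branch(A, B, s0, K0, K_end, h=1e-3):
    """Follow one root-locus branch of g(s,K) = B(s) + K A(s) with Euler steps (8)."""
    dA, dB = polyder(A), polyder(B)
    sign = 1.0 if K_end >= K0 else -1.0          # increase or decrease K
    n_steps = int(round(abs(K_end - K0) / h))
    s, K = complex(s0), float(K0)
    path = [(K, s)]
    for _ in range(n_steps):
        g = polyval(B, s) + K * polyval(A, s)
        dg_ds = polyval(dB, s) + K * polyval(dA, s)
        s = s - h * (g + sign * polyval(A, s)) / dg_ds
        K = K + sign * h
        path.append((K, s))
    return path

# Assumed example: open-loop poles at 0 and -2, no zeros; start at the pole s = 0.
A = [1.0]
B = [1.0, 2.0, 0.0]
path = trace_branch(A, B, s0=0.0, K0=0.0, K_end=0.75)
K_final, s_final = path[-1]
print(K_final, s_final)
```

The run stops at K = 0.75, short of the break-away point at s = -1 (K = 1), where the denominator B'(s) + K A'(s) vanishes and the recursion is no longer usable as written.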

It is a well-known fact that the root-locus plot for T(s) contains n
branches starting from the open-loop poles at K = 0. Therefore, in the
case where the open-loop poles are distinct, the n initial conditions for
(7) are selected at K(0) = 0 and s(0) = p_i, i = 1, 2, ..., n. In the case where
the open-loop transfer function contains repeated poles, the term
(\partial g/\partial s) becomes zero when evaluated at the repeated poles. As a
result, the selection of starting points cannot be made at K = 0 for the
repeated poles. Approximate starting points, however, can easily be obtained by
analyzing the properties of the root loci in the neighborhood of the
repeated pole. Suppose the open-loop transfer function has a repeated pole
p_i with multiplicity r and the corresponding g(s,K) has the form

    g(s,K) = B(s) + K A(s)
           = (s - p_1)(s - p_2) \cdots (s - p_i)^r \cdots (s - p_{n-r+1}) + K A(s) = 0.   (9)

Let

    w = p_i + \epsilon e^{j\theta}                                        (10)

where \epsilon is an arbitrarily small real number and \theta is a phase angle
yet to be determined. Thus at s = w

    g(w,K) = g(p_i + \epsilon e^{j\theta}, K) = 0.                        (11)

Solving e^{j\theta} from (11) gives

    [Equation (12) is illegible in the source.]                           (12)

Therefore r approximate starting points in the neighborhood of the
repeated pole p_i and the corresponding open-loop gain can be evaluated
from (12) as

    [Equation (13), indexed by l = 0, 1, 2, ..., r-1, is illegible in the source.]   (13)

With the proper choice of n starting points, the n branches of the
root-locus plot can be traced in a continuous manner by numerical
integration. In computing the root locus, care must be exercised when
approaching a singular point on the locus.

III. Singular Points

A point s* satisfying

    \frac{dK}{ds} = \frac{d}{ds}\left( -\frac{B(s)}{A(s)} \right)
                  = -\frac{A(s^*)B'(s^*) - B(s^*)A'(s^*)}{A^2(s^*)} = 0   (15)

in the complex plane is called a singular point. Since the numerator of
(15) is an (n+m-1)th order polynomial of real coefficients, there are
(n+m-1) singular points in the s-plane. Only those singular points that
are located on the root loci will be considered. In view of the fact that
A(s*) cannot be zero for a finite K, it follows that on the root locus, the
condition

    A(s^*)B'(s^*) - B(s^*)A'(s^*) = A(s^*)(B'(s^*) + K A'(s^*)) = 0       (16)

implies

    \frac{\partial g}{\partial s}\Big|_{s=s^*} = B'(s^*) + K A'(s^*) = 0. (17)

Hence, (7) is not valid at s = s* and modifications must be made to
handle these singular points. Condition (15) does include the conventional
break-in and break-away points at which K is either a local
maximum or a local minimum on the real axis, respectively. In general,
(dK/ds) is a complex quantity if s is not located on the real axis and it
does not make sense to talk about local extremal values without proper
modification. Now, since K is a real-valued function of s for all s on the
root locus l, the directional derivative (dK/dl) together with its higher
order derivatives along the tangential direction of the locus are well
defined. It is thus possible to consider local extremal values of K along l
using the notion of directional derivatives. Let

    K(s) = U(x,y) + jV(x,y)                                               (18)

where U and V are real-valued functions of x and y, and

    s = x + jy.                                                           (19)

The directional derivative of K at a point s \in l in the unit direction
e_0 = x_0 + j y_0, tangential to the root locus, is

    \frac{dK}{dl} = \nabla U(x,y) \cdot e_0
                  = \frac{\partial U}{\partial x} x_0 + \frac{\partial U}{\partial y} y_0.   (20)

The above equation can also be written in the form

    [Equation (21) is illegible in the source.]                           (21)

Making use of the Cauchy-Riemann condition, (21) reduces to

    \frac{dK}{dl} = Re[ K'(s) e_0 ]                                       (22)

where K' = dK/ds.

Similarly, higher order directional derivatives of K with respect to l are
related to the higher order derivatives by

    \frac{d^n K}{dl^n} = Re[ K^{(n)}(s) e_0^n ].                          (23)

Thus, it is seen from (23) that along l, K and its directional derivatives
are all real-valued functions. The following theorem, which plays an
important role in singularity classification, will be proved.

Theorem: Suppose p(s) is an analytic function such that

    p(s^*) = a,  for a real a \geq 0,
    p^{(k)}(s^*) = 0,  k = 1, 2, ..., q-1,
    p^{(q)}(s^*) \neq 0,  2 \leq q \leq n

at some point s* located on Im p(s) = 0. Let

    R_p = \{ s \mid Im\, p(s) = 0 \}.

Then, in the neighborhood of s*, R_p consists of q branches, R_1, R_2, ...,
R_q, intersecting at s = s*. Furthermore, for each i, 1 \leq i \leq q,
Re p(s)|_{R_i} is either a local maximum or a local minimum at s* if q is
even; it is either an increasing function or a decreasing function if q is
odd.

Proof: Without loss of generality s* can be assumed to be zero.
Before considering the general case, the theorem is proved for

    h(s) = a + s^q.

Identifying the set R_h from

    R_h = \{ s \mid Im\, h(s) = 0 \}

gives

    R_h = \{ r e^{j\theta} \mid r^q \sin q\theta = 0 \}
        = \{ r e^{j\theta} \mid \theta = i\pi/q,\; i = 0, 1, 2, ..., q-1 \}

where \theta is restricted to the upper half of the s-plane and r assumes
negative values in the lower-half plane. Thus, R_h consists of q
intersecting branches R_{h,i}, i = 0, 1, ..., q-1. The intersection occurs at
s = 0. For each i,

    Re\, h(s)|_{R_{h,i}} = a + r^q \cos(q\theta) = a + r^q \cos(i\pi).

Therefore if q is even, r^q is always nonnegative, and

    Re\, h(s)|_{R_{h,i}} \geq a  when \cos i\pi = 1,
    Re\, h(s)|_{R_{h,i}} \leq a  when \cos i\pi = -1,

i.e., h(s)|_{R_{h,i}} is either a local maximum or a local minimum. Now if q
is odd, then for each i,

    Re\, h(s)|_{R_{h,i}} = a + r^q \cos(i\pi).

Hence, (h(s)|_{R_{h,i}} - a) will change sign either from plus to minus as r
increases from negative value to positive value or vice versa, i.e.,
it is a monotonic function.

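Returning to the general case, $p(s)$ can be expanded into a Taylor
series around $s^*$ as

$$p(s) = p(s^*) + \sum_{k=q}^{\infty} c_k (s - s^*)^k = a + (s - s^*)^q \sum_{k=0}^{\infty} c_{q+k} (s - s^*)^k.$$

Since the summation in the above equation is an analytic function and
has no zero in a small disc around $s^*$, from a theorem in the theory of
complex variables (see, e.g., [11]), there exists an analytic function $u(s)$
such that

$$\sum_{k=0}^{\infty} c_{q+k} (s - s^*)^k = \exp(u(s)).$$

Let $c(s) = \exp(u(s)/q)$. Then

$$p(s) = a + [(s - s^*) c(s)]^q = a + [f(s)]^q$$

where $f(s^*) = 0$, $f'(s^*) \ne 0$. Thus, $f(s)$ is a local homeomorphism. Now

$$\{s \mid \operatorname{Im}(a + (f(s))^q) = 0\} = \bigcup_{i=0}^{q-1} l_i, \qquad l_i = \{s \mid f(s) \in R_i\}, \quad i = 0, 1, \ldots, q-1.$$

If $q$ is even, then $s \in l_i$ implies $f(s) \in R_i$. It follows that

$$\operatorname{Re} h(f(s)) > \operatorname{Re} h(0) = \operatorname{Re} h(f(s^*)) \quad \text{when } \cos i\pi = 1,$$
$$\operatorname{Re} h(f(s)) < \operatorname{Re} h(0) = \operatorname{Re} h(f(s^*)) \quad \text{when } \cos i\pi = -1.$$

Therefore $\operatorname{Re} p(s)|_{l_i}$ is either a local maximum or a local
minimum at $s^*$. Similarly, if $q$ is odd, then it can be shown that
$\operatorname{Re} p(s)|_{l_i}$ is either an increasing function or a
decreasing function.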
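The even/odd dichotomy can be checked numerically on the model function $h(z) = a + z^q$. The sketch below is only an illustration: the names `h_on_ray` and `classify` are hypothetical, and the rays $z = r e^{j i\pi/q}$ (with $r$ allowed to take either sign) are assumed to play the role of the branches through the singular point.

```python
import cmath
import math

def h_on_ray(a, q, i, r):
    """Re h(z) for h(z) = a + z**q at z = r*exp(j*i*pi/q), r real of either sign."""
    z = r * cmath.exp(1j * i * math.pi / q)
    return (a + z**q).real

def classify(a, q, i, eps=1e-2):
    """Compare Re h - a just before (r = -eps) and just after (r = +eps) r = 0."""
    before = h_on_ray(a, q, i, -eps) - a
    after = h_on_ray(a, q, i, +eps) - a
    if before * after > 0:
        # Same sign on both sides: r = 0 is a local extremum along this ray.
        return "maximum" if before < 0 else "minimum"
    # Sign change: Re h passes monotonically through a.
    return "monotonic"
```

For even $q$ every ray gives an extremum (alternating between minimum and maximum with $i$); for odd $q$ every ray gives a sign change.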

As a direct consequence of the above theorem, the following corollary
is deduced.

Corollary: Suppose $s^*$ is a singular point on $l$ such that
$K(s^*) \ge 0$,

$$\left.\frac{d^k K}{ds^k}\right|_{s=s^*} = 0, \quad k = 1, 2, \ldots, q-1, \qquad \left.\frac{d^q K}{ds^q}\right|_{s=s^*} \ne 0,$$

where $2 \le q \le n$; then there are $q$ branches intersecting at
$s = s^*$. Furthermore, if $q$ is even, then along each branch of the
intersecting root loci at $s = s^*$, $K(s^*)$ is either a local maximum or a
local minimum; otherwise $K(s)$ is a monotonic function of $s$ on that branch
in the neighborhood of $s^*$.

Fig. 2. An even singular point.

Fig. 3. An odd singular point.

Proof: Let $f(s) = -B(s)/A(s)$. Then $f(s)$ is an analytic function. It
follows that

$$B(s) + f(s) A(s) = 0.$$

Comparing the above equation to (3), it is obvious that

$$K = \operatorname{Re} f(s), \qquad 0 = \operatorname{Im} f(s), \quad \text{for all } s \in l.$$

Application of the above theorem to $f(s)$ completes the proof.

According to the above corollary, singular points are characterized by
the properties of higher order derivatives. It is noted that

$$\frac{dK}{ds} = \frac{d}{ds}\left(-\frac{B(s)}{A(s)}\right) = -\frac{\partial g/\partial s}{A(s)}. \qquad (24)$$

Since $g(s)$ is an $n$th-order polynomial with real coefficients,
$\partial^k g/\partial s^k$ can easily be generated. Furthermore, $q$ can at
most be equal to $n$.

With the above corollary, the conventional break-in and break-away
points defined on the real axis can now be generalized as follows.

Definition: In the theorem, the singular point is called an even singular
point if $q$ is even; otherwise, it is said to be an odd singular point.

It is clear from the corollary that on the root locus an even singular
point is either a local maximum or a local minimum along the branch as
defined in the theorem. The conventional break-in and break-away
points are just special cases of even singular points of order 2 (i.e.,
$q = 2$) which are located on the real axis. In general, if a singular point
is located off the real axis, the concept of directional derivatives can be
used to characterize the singular point. From (23) and the corollary it is
obvious that at a $q$th-order singular point $d^k K/dl^k = 0$ for
$k = 1, 2, \ldots, q-1$ and $d^q K/dl^q \ne 0$. There are $q$ branches of the
root loci intersecting at the singular points.
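Because $g(s) = B(s) + K A(s)$ is a polynomial, the order $q$ can be found by repeated differentiation: with $K$ frozen at $K^* = -B(s^*)/A(s^*)$, $q$ is the multiplicity of $s^*$ as a root of $g$. The following is a minimal sketch under that reading; the helper names and the coefficient-list convention (highest power first) are assumptions made for illustration, not part of the paper.

```python
def polyval(c, s):
    """Evaluate a polynomial (coefficients c, highest power first) by Horner's rule."""
    acc = 0
    for ck in c:
        acc = acc * s + ck
    return acc

def polyder(c):
    """Coefficients of the derivative of the polynomial c."""
    n = len(c) - 1
    return [ck * (n - k) for k, ck in enumerate(c[:-1])]

def polyadd(p, r):
    """Add two coefficient lists of possibly different lengths."""
    m = max(len(p), len(r))
    p = [0] * (m - len(p)) + list(p)
    r = [0] * (m - len(r)) + list(r)
    return [x + y for x, y in zip(p, r)]

def singular_order(B, A, s_star, tol=1e-9):
    """Return (q, K*): q is the multiplicity of s_star as a root of B + K* A.

    q = 1 means an ordinary point of the locus; q >= 2 means a singular point.
    """
    K = -polyval(B, s_star) / polyval(A, s_star)
    g = polyadd(B, [K * a for a in A])
    q = 0
    while len(g) > 1 and abs(polyval(g, s_star)) < tol:
        g = polyder(g)   # move on to the next higher derivative
        q += 1
    return q, K
```

For instance, `singular_order([1, 6, 12, 0], [1], -2)` examines $g(s) = s^3 + 6s^2 + 12s + K$ at $s = -2$.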

In generating the root-locus plot, each branch is plotted separately as
$K$ increases. In the neighborhood of an odd singular point, since $K$ is a
monotonic function of $s$ on the locus, the first-order directional
derivative is a continuous function. Thus, when approaching an odd singular
point, it is necessary to jump over the singular point by adding a small
variation $|\Delta s|$ along the tangential direction of the locus. For even
singular points, such a procedure is invalid. Since on the root-locus plot an
even singular point is either a local maximum or a local minimum, such a
construction does not give rise to an increasing $K$. Thus, in order to
continue the plotting of the root locus as a function of increasing $K$, it
is necessary to change the direction of the locus when an even singular
point is approached. Depending on the order $q$ of the even singular
point, $\Delta s$, the change in direction, is chosen as

$$\Delta s = \Delta s_t \, e^{-j\pi/q} \qquad (26)$$

where $\Delta s_t$ is a sufficiently small vector in the tangential direction
of the locus when approaching the singular point. The factor $e^{-j\pi/q}$
can be viewed as a rotational operator, which rotates the direction clockwise
by $\pi/q$. On a branch so constructed, the even singular point no longer has
the characteristics of a local extremum.
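The stepping rule at a singular point can be summarized in a few lines. This is a sketch only; the function name `next_step` is hypothetical, and the step is represented as a complex number in the $s$-plane.

```python
import cmath
import math

def next_step(ds_tangent, q):
    """Step to take on reaching a singular point of order q.

    ds_tangent: small complex step along the tangent of the incoming branch.
    Odd q  -> keep the tangential step (jump over the point).
    Even q -> rotate the step clockwise by pi/q before continuing.
    """
    if q % 2 == 1:
        return ds_tangent
    return ds_tangent * cmath.exp(-1j * math.pi / q)
```

With an incoming step of $-\epsilon$ along the real axis and $q = 2$, the rotated step points along $+j\epsilon$, i.e. the branch leaves the real axis upward.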

It is apparent from the foregoing root locus construction that the $q$
branches will directly intersect each other at an odd singular point and
that the modified $q$ root loci will touch each other at an even singular
point. For obvious reasons, an odd singular point is known as an
intersecting point while an even singular point is called a touching point.
The graphical illustrations for these two types of singularities are shown
in Fig. 2 and Fig. 3 for $q = 2$ and $q = 3$, respectively. The branches are
numbered and the arrows are pointed in the direction of increasing $K$.

After the change of direction at a touching point or the jump over an
intersecting point, it may be necessary to make corrections if the point
selected is not close enough to the locus. This can usually be achieved
within a few steps by using the Newton iteration

$$s_{k+1} = s_k - \frac{B(s_k) + K A(s_k)}{B'(s_k) + K A'(s_k)}. \qquad (27)$$

The gain $K$ which corresponds to the corrected point can be evaluated
either from

$$K = -\operatorname{Re} B(s) / \operatorname{Re} A(s) \qquad (28)$$

or from

$$K = -\operatorname{Im} B(s) / \operatorname{Im} A(s). \qquad (29)$$

Equations (28) and (29) are derived by taking the real and the imaginary
parts of $g(s) = 0$, respectively.
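A correction step of this kind can be sketched as follows, assuming the iteration is the standard Newton method applied to $g(s) = B(s) + K A(s)$ at a fixed gain. The function names and the coefficient-list convention (highest power first) are illustrative assumptions.

```python
def polyval(c, s):
    """Evaluate a polynomial (coefficients c, highest power first) by Horner's rule."""
    acc = 0
    for ck in c:
        acc = acc * s + ck
    return acc

def polyder(c):
    """Coefficients of the derivative of the polynomial c."""
    n = len(c) - 1
    return [ck * (n - k) for k, ck in enumerate(c[:-1])]

def newton_correct(B, A, K, s, steps=5):
    """Pull s back toward a root of g(s) = B(s) + K*A(s) with Newton's iteration."""
    dB, dA = polyder(B), polyder(A)
    for _ in range(steps):
        g = polyval(B, s) + K * polyval(A, s)
        dg = polyval(dB, s) + K * polyval(dA, s)
        s = s - g / dg
    return s

def gain_at(B, A, s):
    """Gain from the real part of g(s) = 0, or from the imaginary part
    when Re A(s) is the smaller of the two denominators."""
    b, a = polyval(B, s), polyval(A, s)
    if abs(a.real) >= abs(a.imag):
        return -b.real / a.real
    return -b.imag / a.imag
```

Choosing between the two gain formulas by the size of the denominator is a design choice of this sketch; it simply avoids dividing by a quantity close to zero.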

IV. Examples

In this section a number of examples are presented to illustrate the
proposed algorithm for obtaining the root-locus plot. Although any
integration technique can be used to solve (7), only Euler's method
with variable step size is used for illustration.

Example 1: Consider a linear feedback system whose open-loop
transfer function is given by

$$G(s)H(s) = \frac{K(s+1.5)(s+5.5)}{s(s+1)(s+5)}.$$

There are three simple poles at 0, -1, and -5, and two finite zeros at
-1.5 and -5.5. Application of (7) with starting points 0, -1, and -5 at
$K = 0$ leads to three root loci shown in Fig. 4. It turns out that all four
singular points are located on the root-locus plot and they are all
classified as even singular points with $q = 2$. It is thus necessary to
change the direction of the root locus when each singular point is
approached. For branch 1, when $Q_1$ is approached, the tangential
direction is $\Delta s_t = -\epsilon$ and $\Delta s_1$ is chosen as
$(-\epsilon)e^{-j\pi/2} = +j\epsilon$. Similarly, when branch 1 approaches
$Q_2$, $Q_3$, and $Q_4$, the changes in direction are chosen as
$\Delta s_2 = (-j\epsilon)e^{-j\pi/2}$, $\Delta s_3 = (-\epsilon)e^{-j\pi/2}$,
and $\Delta s_4 = (-j\epsilon)e^{-j\pi/2}$, respectively. Other branches,
denoted by 2 and 3, are obtained in a similar manner.

Example 2: As an example of the case with multiple poles, consider

$$G(s)H(s) = \frac{K}{(s+3)(s+1)^2}$$

where $s = -1$ is a repeated pole with multiplicity $r = 2$. From (14),

$$w_0 = -1 + \epsilon e^{j\pi/2}, \qquad w_1 = -1 + \epsilon e^{j3\pi/2}, \qquad K = 2\epsilon^2$$

and $\epsilon$ is chosen as 0.2 for illustration. Thus, the two approximate
starting points are

$$w_0 = -1 + j0.2, \quad K = 0.08$$
$$w_1 = -1 - j0.2, \quad K = 0.08.$$

The root loci are obtained by using (8) with $s(0) = w_0, w_1, -3$ and
$K(0) = 0.08, 0.08, 0$, respectively. The results are shown in Fig. 5, where
the root loci are plotted up to $K = 5$.

Example 3: In this example, the open-loop transfer function is
assumed to have one real pole and two complex conjugate poles:

$$G(s)H(s) = \frac{K}{s(s+3+j\sqrt{3})(s+3-j\sqrt{3})}.$$

There is only one odd singular point with $q = 3$ located on the root loci.
It is thus necessary to jump over the singular point when it is approached,
and $\Delta s$ is chosen in the tangential direction of the locus. The
complete root-locus plot is shown in Fig. 6.
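The odd singular point of Example 3 can be located by hand and confirmed with a few lines of arithmetic. The sketch assumes a constant numerator, so that $g(s) = s(s^2 + 6s + 12) + K$; the point $s^* = -2$ is the double root of $dg/ds = 3(s+2)^2$.

```python
def g(s, K):
    """Characteristic polynomial s(s^2 + 6s + 12) + K (unit numerator assumed)."""
    return s * (s * s + 6 * s + 12) + K

def dg(s):
    """First derivative of g with respect to s."""
    return 3 * s * s + 12 * s + 12

def d2g(s):
    """Second derivative of g with respect to s."""
    return 6 * s + 12

s_star = -2
# Gain that places a closed-loop root at s_star.
K_star = -s_star * (s_star**2 + 6 * s_star + 12)
```

At `s_star` the polynomial and its first two derivatives vanish while the third derivative is the constant 6, which is the pattern of a singular point with $q = 3$.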

Example 4: The final example demonstrates the case where the singular
points are located off the real axis. Consider

$$G(s)H(s) = \frac{K}{(s+1)^2 (s+1+j\sqrt{18})(s+1-j\sqrt{18})}.$$

It is seen that

$$g(s) = B(s) + K A(s) = s^4 + 4s^3 + 24s^2 + 40s + 19 + K.$$

Setting the derivative of $g$ with respect to $s$ to zero yields three
singular points, namely, $s_1 = -1$, $s_2 = -1 + j3$, and $s_3 = -1 - j3$.
Since $s_1 = -1$ is a repeated open-loop pole, it can be taken care of as a
starting point. At $s_2$ and $s_3$, it is easily verified that

$$\operatorname{Im} g(s_2) = \operatorname{Im} g(s_3) = 0.$$

This indicates that both $s_2$ and $s_3$ are located on the root loci.
Application of the proposed root-locus plotting procedure with four starting
points

$$s_{10} = -1 + j\epsilon, \quad K = 0.18, \quad \epsilon = 0.1$$
$$s_{20} = -1 - j\epsilon, \quad K = 0.18, \quad \epsilon = 0.1$$
$$s_{30} = -1 + j\sqrt{18}, \quad K = 0$$
$$s_{40} = -1 - j\sqrt{18}, \quad K = 0$$

leads to the complete root-locus plot shown in Fig. 7. It is clear from this
example that a necessary condition for the existence of complex singular
points is that the order of the open-loop transfer function be greater than
or equal to four.
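The three singular points quoted in Example 4 can be verified directly. The sketch below assumes $A(s) = 1$, so that $dg/ds = dB/ds$ and the gain at a point of the locus is $K = -B(s)$; the variable names are illustrative.

```python
def B(s):
    """Open-loop denominator (s+1)^2 (s+1+j*sqrt(18)) (s+1-j*sqrt(18)), expanded."""
    return s**4 + 4 * s**3 + 24 * s**2 + 40 * s + 19

def dB(s):
    """Derivative of B; with a constant numerator this is also dg/ds."""
    return 4 * s**3 + 12 * s**2 + 48 * s + 40

candidates = [-1, -1 + 3j, -1 - 3j]
# K = -B(s) because A(s) = 1 is assumed; a real value means s lies on the locus.
gains = [-B(s) for s in candidates]
```

The derivative vanishes at all three candidates, and the gain at the two complex points is real, so both lie on the root loci.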

V. Conclusion

An algorithm for generating the root-locus plot has been presented.
Classification of singular points has also been discussed in detail. It is
shown that the conventional break-in and break-away points are just
special cases of even singular points. The computer-aided method
successfully solves the problem of discontinuity of the direction of the
locus at singular points and enables one to plot the root loci without
missing or repeating any branch.


A Root Locus Technique for Interconnected Systems

RAYMOND A. DECARLO, MEMBER, IEEE, AND
R. SAEKS, FELLOW, IEEE

Abstract: This note presents a numerically feasible technique
for computing composite system eigenvalues from component/subsystem
eigenvalues in the component connection model context. The technique is a
natural extension of previous artificial methods of computing system
eigenvalues in a state model having a perturbed A matrix. The present
technique allows one to trace the movement of component eigenvalues as
coupling is introduced. Furthermore, the technique permits investigation of
eigenvalue movement as a function of interconnection gains. This is useful
in analyzing short/open circuit phenomena as well as other system
characteristics. Finally, the technique is useful for determining and
understanding composite system stability in terms of component
and connection information.

I. Introduction

This note describes a root locus technique in which the "root
locus" begins at the eigenvalues of the individual component state
equations and traces out trajectories as coupling between components
is continuously and uniformly introduced. The component connection
model [5], [14], [16], [12], [13], [4], [18] is the natural vehicle for
executing the approach as opposed to distantly related perturbation
schemes [6], [7], [11].

Numerical implementation is discussed through a continuations
approach [2], [3], [19], [10], [1], which in this case is a set of
coupled differential equations characterizing the interplay between
the eigenvalues/eigenvectors of the appropriate composite system matrix
as a function of an underlying parameter $r$. The coupled differential
equations are given in the Appendix. The idea is to initialize these
equations at the eigenvalues/eigenvectors of the components and integrate
to obtain those of the composite system.

The technique is applicable to the study of short/open circuit
phenomena as well as the investigation of the effect on composite
system eigenvalues of increasing/decreasing coupling gains between
components.

It is assumed that each component has a completely controllable
and observable state model:

$$\dot{x}_i = A_i x_i + B_i a_i$$
$$b_i = C_i x_i + D_i a_i \qquad (1)$$

where $a_i$, $b_i$, and $x_i$ are vectors of component inputs, outputs, and
states, respectively. Combining the component descriptions
defines a composite component state model as

$$\dot{x} = Ax + Ba$$
$$b = Cx + Da \qquad (2)$$

where $a = [a_1, \ldots, a_n]^t$, $b = [b_1, \ldots, b_n]^t$,
$x = [x_1, \ldots, x_n]^t$, $A = \operatorname{diag}[A_1, \ldots, A_n]$, etc.
All vectors are assumed conformable, and for each time instant they possess
values in the appropriately dimensioned Euclidean space. The system
description is completed with the connection equations, which are

$$\begin{bmatrix} a \\ y \end{bmatrix} = \begin{bmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} b \\ u \end{bmatrix} \qquad (3)$$

where the $L_{ij}$ are real matrices accounting for KVL, KCL, and/or
other conservation laws; $y$ and $u$ are vectors of system outputs and
inputs, respectively. Equations (2) and (3) constitute the component
connection model. Except for theoretical analysis and/or
describing relationships between classical models and the component
connection model, one never combines (2) and (3) into a
single set of equations. All relevant simulation and analysis is
possible without combining the equations [16]-[18], [5], [2], [4],
[12]-[14], [3].

Under general considerations [16], [5], [18] the composite component
state vector $x$ and the system state vector may be chosen to
coincide, so that a valid composite system state model exists as
follows:

$$\dot{x} = Fx + Gu$$
$$y = Hx + Ju \qquad (4)$$

where $F$, $G$, $H$, and $J$ can be expressed in terms of the matrices of
(2) and (3). In particular,

$$F = A + B(I - L_{11}D)^{-1} L_{11} C. \qquad (5)$$

Since $G$, $H$, and $J$ are unrelated to the discussion of this note, their
specific form is omitted. The component interconnection matrix
$L_{11}$ is explicit in (5). This explicitness motivates and permits the
root locus technique.

II. The Root Locus Technique

Clearly the composite system eigenvalues are the roots of the
polynomial $\det[\lambda I - F]$ where $F$ is defined in (5). To consider the
effect of coupling/interaction among components, replace $L_{11}$ by
$rL_{11}$ for a scalar $r$ to obtain

$$F(r) = A + B[I - (rL_{11})D]^{-1}(rL_{11})C. \qquad (6)$$

Clearly, $F(0) = A$, as in (2), and $F(1) = F$, the relevant composite
system matrix.

A continuations approach to computing the root locus of the
eigenvalues of $F(r)$, $0 \le r \le 1$, uses (21)-(23) from the Appendix,
which require knowledge of $\dot{F}(r)$. Unfortunately, the eigenvalues
of $F(r)$ must be distinct, since the coupled differential equations
become singular otherwise. Using the well-known matrix identities

$$\frac{d}{dr}[M^{-1}] = -M^{-1} \dot{M} M^{-1} \qquad (7)$$

$$X(I - YX)^{-1} = (I - XY)^{-1} X, \qquad (8)$$

whenever either inverse exists, and

$$(I - XY)^{-1} = I + X(I - YX)^{-1} Y \qquad (9)$$

results in

$$\dot{F}(r) = B(I - rL_{11}D)^{-2} L_{11} C \qquad (10)$$

where the superscript -2 indicates the inverse squared.
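The closed form for the derivative can be checked against a finite difference. The sketch below uses NumPy with randomly generated matrices; the sizes, the random seed, and the scaling of `D` (kept small so that the matrix being inverted is well conditioned) are arbitrary assumptions made only for the check.

```python
import numpy as np

def F(r, A, B, C, D, L11):
    """Composite matrix F(r) = A + B (I - r L11 D)^{-1} (r L11) C."""
    I = np.eye(L11.shape[0])
    return A + B @ np.linalg.solve(I - r * L11 @ D, r * L11 @ C)

def F_dot(r, B, C, D, L11):
    """dF/dr = B (I - r L11 D)^{-2} L11 C, computed with two linear solves."""
    I = np.eye(L11.shape[0])
    M = I - r * L11 @ D
    return B @ np.linalg.solve(M, np.linalg.solve(M, L11 @ C))

rng = np.random.default_rng(0)
n, m = 4, 3   # hypothetical sizes: n states, m connection variables
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((m, n))
D = 0.05 * rng.standard_normal((m, m))
L11 = rng.standard_normal((m, m))
```

Two solves with the same matrix stand in for the inverse squared, which avoids forming an explicit inverse.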

By initializing (21)-(23) at the component (subsystem)
eigenvalues/eigenvectors, one integrates the coupled differential
equations to obtain the root locus, i.e., the trajectories of the
eigenvalues of the system as coupling is introduced. If any of the
eigenvalue trajectories cross, it is necessary to perturb $r$ around
the point of intersection and reinitialize the algorithm.

When the composite state model $D$-matrix is zero, $\dot{F}(r) =
BL_{11}C$, a constant. When $D \ne 0$, it is necessary to compute
$(I - rL_{11}D)^{-1}$ at each step of the integration. Typically,
$\operatorname{rank}(D) \ll \dim(I)$. Viewing $-rL_{11}D$ as a low-rank
perturbation on $I$, Householder's formula [9], [5] provides a convenient and
quick means for computing $(I - rL_{11}D)^{-1}$.
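A low-rank inverse update of this kind can be sketched as follows. It is assumed here that the formula in question is the familiar identity $(I - UV)^{-1} = I + U(I_k - VU)^{-1}V$, which has the same form as (9); the factorization of the perturbation into a tall matrix `U` and a wide matrix `V` is also an assumption of the sketch.

```python
import numpy as np

def inv_low_rank(U, V):
    """(I - U V)^{-1} = I + U (I_k - V U)^{-1} V.

    U is m-by-k and V is k-by-m with k much smaller than m, so only a
    k-by-k system has to be solved.
    """
    k = U.shape[1]
    small = np.eye(k) - V @ U
    return np.eye(U.shape[0]) + U @ np.linalg.solve(small, V)
```

For a rank-$k$ matrix $D = D_1 D_2$, one would take `U = r * L11 @ D1` and `V = D2`.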

Note that (6) is not an artificial linear perturbation of the
composite system $F$ matrix to account for stray capacitances/inductances
on eigenvalue movement or to ease computing the eigenvalues of $F$
[6], [7], [11]. Indeed, the QR-algorithm is much more efficient and
accurate. However, this formulation is natural in the component connection
model context; it is numerically implementable through a continuations
approach as above or through an iterative calling of a QR-algorithm; it
identifies components giving rise to particular composite system
eigenvalues by tracing the eigenvalue locus of $F(r)$; and finally, by
replacing $L_{11}$ by $L_{11} + rP$ ($P$ is of low rank), it offers a means of
investigating coupling gains on composite system eigenvalue locations
as follows.

Assume the composite system eigenvalues/eigenvectors are
known and distinct. To investigate the effects of gain changes,
replace $L_{11}$ in (5) by $(L_{11} + rP)$, where $P$ is a low-rank matrix
characterizing the gain variation and $r$ varies over some relevant
finite interval containing zero. This produces

$$F(r) = A + B(I - L_{11}D - rPD)^{-1}(L_{11} + rP)C. \qquad (11)$$

Computing a root locus via a continuations approach requires
$\dot{F}(r)$, which can be derived as

$$\dot{F}(r) = B[I - L_{11}D - rPD]^{-1} P [I + D(I - L_{11}D - rPD)^{-1}(L_{11} + rP)] C. \qquad (12)$$

Of course if $D = 0$, $\dot{F}(r) = BPC$, which is constant, sparse, and
of low rank. Although (12) seems formidable, it is possible to view
$rPD$ as a low-rank perturbation on $(I - L_{11}D)$ and use Householder's
formula to compute the necessary inverses during each
step of the integration of (21)-(23). Of course, stability or
instability is known simply by noting the position of the terminal
points of the root locus.

In accounting for effects of stray capacitances/inductances, the
component connection model is superior to some other
approaches [6], [7], [11]. Consider that the $A$ matrix in (2) is block
diagonal (predominantly diagonal for circuits) describing only
component information. Suppose a typical entry is $1/C$, characterizing
a capacitance. Clearly $\partial A/\partial C$ is zero, except for the
particular $C$-dependent diagonal entry. If $C_0$ is the nominal $C$, using
$(\partial A/\partial C)|_{C_0}$ in (21) gives the approximation
$(\partial \lambda_i/\partial C)|_{C_0}$. Considering
$\Delta C\,(\partial \lambda_i/\partial C)|_{C_0}$ for each $i$ gives a good
first-order approximation to direction and magnitude of eigenvalue movement
relative to small perturbations in $C$.

III. Conclusions

We have described a technique for determining eigenvalue
movement of a system in terms of connection information. Several
applications were described. It is especially useful for determining
what components give rise to particular interconnected system
eigenvalues. This in turn provides a means of reduced order
modeling without changing the system structure. Suppose a component
(as identified by the root locus technique) gives rise to a
composite system eigenvalue corresponding to a slow mode of the
system. This component can then be replaced by a constant gain
amplifier. This reduces the number of state variables in the system
model without changing the interconnected structure of the
system which is represented by the connection matrices.

IV. An Example

Consider the system configuration of Fig. 1. There are two
components (subsystems) as indicated by the roman numerals I
and II. The triangular block indicates an amplifier whose gain is
two. For convenience, the action of this amplifier is reflected in the
connection equations.

Suppose component I has the following state model:

$$\dot{x}_1 = -x_1 + a_1$$
$$b_1 = x_1. \qquad (13)$$

Let the state model for component II be given by (14).

Defining $x = [x_1, x_2, x_3]^t$, $a = [a_1, a_2, a_3]^t$, and
$b = [b_1, b_2, b_3]^t$, the composite component state model is given by (15).

By inspection of Fig. 1, the connection equations are

$$\begin{bmatrix} a \\ y \end{bmatrix} = \left[\begin{array}{ccc|cc} 0 & -1 & -1 & 1 & 0 \\ 2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ \hline 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{array}\right] \begin{bmatrix} b \\ u \end{bmatrix} \qquad (16)$$

where $y = [y_1, y_2]^t$ and $u = [u_1, u_2]^t$. Equations (15) and (16)
constitute the complete system model, i.e., the component connection
model for the system of Fig. 1.

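Define $F(r)$ as per (6), in which case

$$F(r) = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & -2 \\ 0 & 2 & -1 \end{bmatrix} + \begin{bmatrix} 0 & -r & -r \\ 2r & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \qquad (17)$$

The derivative of $F(r)$ clearly is

$$\dot{F}(r) = \begin{bmatrix} 0 & -1 & -1 \\ 2 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \qquad (18)$$

Implementing the root locus via the continuation equations
(21)-(23) results in the root locus plotted in Fig. 2.

Fig. 2. Plot of "root locus."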
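The same locus can be traced by the alternative route mentioned earlier in the note, namely repeated eigenvalue computation on a grid of $r$. The sketch below does that with NumPy for the two matrices read off from (17); the grid size and the function names are arbitrary choices, and this is not the continuation integration itself.

```python
import numpy as np

A = np.array([[-1.0, 0.0, 0.0],
              [0.0, -1.0, -2.0],
              [0.0, 2.0, -1.0]])
BL11C = np.array([[0.0, -1.0, -1.0],
                  [2.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])

def F(r):
    """F(r) = A + r * (B L11 C), the two terms of (17)."""
    return A + r * BL11C

def locus(num=101):
    """Eigenvalues of F(r) on a uniform grid of r in [0, 1]; one row per r."""
    rs = np.linspace(0.0, 1.0, num)
    eigs = np.array([np.linalg.eigvals(F(r)) for r in rs])
    return rs, eigs

rs, eigs = locus()
```

The first row of `eigs` holds the component eigenvalues (those of `A`) and the last row holds the eigenvalues of the fully coupled system, whose real parts decide stability.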

Appendix

Eigenvalue Dynamics

Let $F(r)$ be an $n \times n$ matrix with possibly complex entries
depending on the parameter $r$. Let $F(r)^*$ be its adjoint matrix.
Note $F(r)^*$ is the unique matrix satisfying

$$\langle F(r)x, y \rangle = \langle x, F(r)^* y \rangle \qquad (19)$$

for all complex $n$-vectors $x$ and $y$, where $\langle \cdot, \cdot \rangle$
is the Euclidean inner product defined as

$$\langle x, y \rangle = \sum_{i=1}^{n} x_i \bar{y}_i \qquad (20)$$

where $\bar{y}_i$ is the complex conjugate of the $i$th entry of the column
vector. The essential theorem here is the following.

Theorem 1: Let $F(r)$ and its adjoint $F(r)^*$ have eigenvector
trajectories $e_i(r)$ and $\hat{e}_i(r)$, and eigenvalue trajectories
$\lambda_i(r)$ and $\bar{\lambda}_i(r)$, respectively, for
$i = 1, 2, \ldots, n$. Then for any value of $r$ where
the eigenvalues of $F(r)$ are all distinct,

$$\frac{d\lambda_i}{dr} = \frac{\langle (dF/dr)\, e_i, \hat{e}_i \rangle}{\langle e_i, \hat{e}_i \rangle}, \qquad i = 1, 2, \ldots, n. \qquad (21)$$

[Equations (22) and (23), for $de_i/dr$ and $d\hat{e}_i/dr$, are illegible in the scan.]

The eigenvalues of $F(r)$ and $F(r)^*$ are complex conjugates of each
other, whereas the eigenvectors $e_i(r)$ and $\hat{e}_i(r)$ are not. This
theorem is motivated by [8], where (21) is implicitly expressed and
the necessary tools for proving the theorem are made clear.

The proof of the theorem can be found in [5], [4]. A derivation
of (21) can be found in [7] as well as [8].
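The eigenvalue rate equation can be sketched numerically. The code below assumes the rate has the standard form $d\lambda_i/dr = \langle \dot{F} e_i, \hat{e}_i \rangle / \langle e_i, \hat{e}_i \rangle$, with $\hat{e}_i$ an eigenvector of the adjoint; the function name and the way adjoint eigenvectors are paired with eigenvalues are choices of this sketch.

```python
import numpy as np

def eig_rates(F, F_dot):
    """Return (lambda, d lambda / dr) for a matrix F with distinct eigenvalues."""
    lam, E = np.linalg.eig(F)               # columns of E: eigenvectors e_i
    mu, Ehat = np.linalg.eig(F.conj().T)    # eigenvectors of the adjoint
    rates = np.empty(lam.size, dtype=complex)
    for i in range(lam.size):
        # Pair lambda_i with the adjoint eigenvalue closest to its conjugate.
        j = np.argmin(np.abs(mu - np.conj(lam[i])))
        e, ehat = E[:, i], Ehat[:, j]
        # <x, y> = sum_k x_k conj(y_k), which is np.vdot(y, x).
        rates[i] = np.vdot(ehat, F_dot @ e) / np.vdot(ehat, e)
    return lam, rates
```

Integrating these rates in $r$ (together with the eigenvector equations) is what the continuation approach does; the function above only evaluates the right-hand side for the eigenvalues at one value of $r$.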
10. Reprint of "A Continuation Algorithm for Sparse Matrix Inversion" by
R. Saeks from the IEEE Proceedings, Vol. 67, pp. 682-683,
(1979).


A  Continuations  Algorithm  for  Sparse  Matrix  Inversion 
RICHARD  SAEKS 

In the various algorithms used for the analysis and design of large-scale
circuits and systems, the problem of inverting a continuously
parameterized family of sparse matrices $M(r)$ is often encountered
[1]-[5]. In frequency domain analysis, this might represent a transfer
function matrix which one must invert over a specified frequency range
[3], while in time-domain analysis, such an $M(r)$ arises in the form of
the Jacobian matrix for the system equations [1], which is dependent
on some potentially variable parameter $r$. Typically, one inverts $M(r)$ at
a discrete set of points $r_i$, $i = 1, 2, \ldots, n$, using a sparse matrix
algorithm. Indeed, the more efficient algorithms exploit the fact that the
matrices $M(r_i)$ have a common sparsity structure, allowing much of the
computational overhead to be shared by the $n$ inversions [1].

An alternative to repeated inversion is the continuations algorithm
[5] wherein one integrates the differential equation

$$\dot{Z}(r) = -Z(r)\,(dM/dr)\,Z(r), \qquad Z(0) = M(0)^{-1} \qquad (1)$$

to obtain $M(r)^{-1} = Z(r)$. While the integration of (1) is far more
efficient than repeated matrix inversion for small matrices, it fails to take
advantage of the sparseness of $M(r)$, thereby rendering the technique
inapplicable in a large-scale systems context. The purpose of the present
note is to present an alternative continuation algorithm which
combines the LU factorization technique of sparse matrix inversion
with (1).

Recall the standard sparse matrix inversion technique [6] wherein
one factors a matrix into the form $M = LU$ where $L$ is lower triangular
and $U$ is upper triangular with ones along the diagonal. We then
represent the inverse matrix in the form $M^{-1} = U^{-1}L^{-1}$. The key to
the technique is that both $L$ and $U$ and their inverses will be sparse if
$M$ is sparse (though, in general, $M^{-1}$ is not sparse). As such, one may
store and manipulate the inverse of a sparse matrix via its sparse upper and
lower triangular factors $U^{-1}$ and $L^{-1}$, even though the inverse matrix
itself is nonsparse. These ideas are combined with the continuation
algorithm concept in the following theorem [7]. Here, the notation
$u[M]$ is used to denote the strictly upper triangular matrix obtained
from $M$ by setting all of the entries of $M$ on or below the diagonal to
zero. Similarly, $l[M]$ denotes the lower triangular matrix obtained
from $M$ by setting all of the entries above the diagonal to zero.

Theorem: Let $X(r)$ and $Y(r)$ be solutions of the matrix differential
equation

$$\dot{X} = -X\,u[Y (dM/dr) X], \qquad X(0) = U(0)^{-1}$$
$$\dot{Y} = -l[Y (dM/dr) X]\,Y, \qquad Y(0) = L(0)^{-1}. \qquad (2)$$

Then, $X(r) = U(r)^{-1}$ and $Y(r) = L(r)^{-1}$ where
$M(r)^{-1} = U(r)^{-1}L(r)^{-1}$ is the LU factored form of $M(r)^{-1}$. Note,
if $M(r)$ and $dM/dr$ are sparse then every matrix involved in the
integration of (2) will be sparse. Moreover, the integration may be carried
out with the aid only of a matrix multiplication algorithm plus a simple
procedure for extracting the upper and lower triangular submatrices of
$Y(dM/dr)X$.

Proof: First, we observe that if $Y(0)$ is lower triangular, then
$\dot{Y}$ will be lower triangular and so will $Y(r)$ for all $r$. Similarly,
if $X(0)$ is upper triangular with ones on the diagonal, then $\dot{X}$,
being the product of an upper triangular and strictly upper triangular
matrix, will be strictly upper triangular. As such, $X(r)$ will be upper
triangular with ones on the diagonal for all $r$. Thus $X(r)$ and $Y(r)$ have
the correct form and it remains to verify the equality
$M(r)^{-1} = X(r)Y(r)$. Here

$$X(r)Y(r) = X(0)Y(0) + \int_0^r \frac{d}{dq}[X(q)Y(q)]\,dq$$
$$= X(0)Y(0) + \int_0^r [\dot{X}(q)Y(q) + X(q)\dot{Y}(q)]\,dq$$
$$= X(0)Y(0) + \int_0^r \{-X(q)\,u[Y(q)(dM/dq)X(q)]\,Y(q) - X(q)\,l[Y(q)(dM/dq)X(q)]\,Y(q)\}\,dq$$
$$= X(0)Y(0) + \int_0^r \{-X(q)[Y(q)(dM/dq)X(q)]Y(q)\}\,dq$$
$$= X(0)Y(0) - \int_0^r [X(q)Y(q)](dM/dq)[X(q)Y(q)]\,dq. \qquad (3)$$

Differentiation of both sides of (3) with respect to $r$ then results in

$$\frac{d}{dr}[X(r)Y(r)] = -[X(r)Y(r)]\,(dM/dr)\,[X(r)Y(r)]. \qquad (4)$$

Finally, a comparison of (4) and (1) reveals that $X(r)Y(r) = M(r)^{-1}$
since both $X(r)Y(r)$ and $M(r)^{-1}$ satisfy the same differential equation.

Consider the family of matrices $M(r)$ given by (5). Here, $M(0)$ is lower
triangular and, hence, has the trivial LU-factorization (6).

Now, upon using an Euler integration formula
[$Z(h) \approx Z(0) + h\dot{Z}(0)$], we may estimate $U(0.1)^{-1}$ and
$L(0.1)^{-1}$ via the equalities

$$U(0.1)^{-1} \approx U(0)^{-1} + (0.1)\,\frac{d}{dr}U(0)^{-1}$$
$$= U(0)^{-1} - (0.1)\,U(0)^{-1}\,u[L(0)^{-1}\dot{M}(0)U(0)^{-1}]$$
$$= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} 0 & 0.1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & -1/10 \\ 0 & 1 \end{bmatrix} \qquad (10)$$

and

$$L(0.1)^{-1} \approx L(0)^{-1} + (0.1)\,\frac{d}{dr}L(0)^{-1}$$
$$= L(0)^{-1} - (0.1)\,l[L(0)^{-1}\dot{M}(0)U(0)^{-1}]\,L(0)^{-1} = \begin{bmatrix} 1 & 0 \\ 9/10 & 9/10 \end{bmatrix}. \qquad (11)$$

Multiplying these estimates then yields

$$M(0.1)^{-1} \approx U(0.1)^{-1}L(0.1)^{-1} = \begin{bmatrix} 91/100 & -9/100 \\ 9/10 & 9/10 \end{bmatrix} \qquad (12)$$

which compares favorably with the exact inverse (13).

The error here is due to the approximation inherent in the numerical
integration process and can be reduced by use of a more accurate
integration procedure. Of course, the result of the theorem is exact and
the computed value for $M(r)^{-1}$ will be as accurate as the integration
process employed.
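One Euler step of the factored continuation equations can be reproduced in exact rational arithmetic. In the sketch below the family $M(r) = \begin{bmatrix} 1 & r \\ -1 & 1 \end{bmatrix}$ is a hypothetical choice: it is not taken from the note, but it has a lower triangular $M(0)$ and yields the same numbers as the estimates quoted in (10)-(12). All function names are illustrative.

```python
from fractions import Fraction as Fr

def matmul(P, Q):
    """Plain matrix product of two lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def strict_upper(M):
    """u[M]: keep entries above the diagonal, zero the rest."""
    n = len(M)
    return [[M[i][j] if j > i else 0 for j in range(n)] for i in range(n)]

def lower(M):
    """l[M]: keep entries on or below the diagonal, zero the rest."""
    n = len(M)
    return [[M[i][j] if j <= i else 0 for j in range(n)] for i in range(n)]

def euler_step(X, Y, dM, h):
    """One Euler step of X' = -X u[Y dM X] and Y' = -l[Y dM X] Y."""
    W = matmul(matmul(Y, dM), X)
    dX = matmul(X, strict_upper(W))
    dY = matmul(lower(W), Y)
    n = len(X)
    X1 = [[X[i][j] - h * dX[i][j] for j in range(n)] for i in range(n)]
    Y1 = [[Y[i][j] - h * dY[i][j] for j in range(n)] for i in range(n)]
    return X1, Y1

# Hypothetical family M(r) = [[1, r], [-1, 1]]: U(0) = I and L(0) = M(0).
X0 = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]   # U(0)^{-1}
Y0 = [[Fr(1), Fr(0)], [Fr(1), Fr(1)]]   # L(0)^{-1} = M(0)^{-1}
dM = [[Fr(0), Fr(1)], [Fr(0), Fr(0)]]   # dM/dr
X1, Y1 = euler_step(X0, Y0, dM, Fr(1, 10))
M_inv_estimate = matmul(X1, Y1)
```

Only matrix products and triangular extraction are used, so with sparse storage each step would touch only the nonzero entries of the factors.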
Reprint of "Multiple Solutions of a Class of Nonlinear Equations" by
C.T. Pan and K.S. Chao from the Proceedings of the 1979 IEEE
International Symposium on Circuits and Systems, Tokyo, July
1979, pp. 577-580.


A  search  method  is  presented  for  obtaining 
multiple  solutions  of  a  system  of  n  nonlinear 
equations  whose  first  (n-1)  equations  do  not  nec¬ 
essarily  define  a  unique  space  curve.  In  particu¬ 
lar,  the  approach  is  used  to  find  all  the  roots  of 
a  coaplex  polynomial.  Singularities  on  the  space 
curve  are  analyzed  and  properly  classified  accord¬ 
ing  to  their  high' order  derivatives.  Depending  on 
the  nature  of  singularities,  the  rules  for  a  sign 
change  in  the  algorithm  are  determined  so  that  the 
root-finding  procedure  can  be  continued. 


The  transition  in  sign  of  f  should  be  made  at  the 

n 

solution  points  and  points  where  the  Jacobian  de¬ 
terminant  changes  sign.  The  method  is  capable  of 
finding  all  solutions  provided  that  the  intersec¬ 
tion  is  a  simple  curve,  i.e.,  a  continuous,  differ¬ 
entiable  curve  which  does  not  intersect  itself. 

The  purpoee  of  this  paper  is  to  generalize 
the  above  method  to  cases  where  the  intersection  is 
multi-branched  and  may  indeed  intersect  itself. 

In  Section  II,  properties  of  nonlinear  equation 
with  multi-branched  and  intersecting  solution 
curves  are  discussed.  The  method  is  then  applied 
to  the  computation  of  all  the  roots  of  a  polynomial 
in  Sectfmn^III.  In  Section  IV,  the  results  ob¬ 
tained  are  illustrated  by  means  of  an  example. 


I.  INTRODUCTION 


An  important  problem  in  the  analysis  and  de¬ 
sign  of  non-linear  circuits  and  systems  is  the  de¬ 
termination  of  multiple  solutions  of  a  nonlinear 
equation 


In Section I, a systematic search method has
been outlined. Success in finding all solutions
depends heavily on whether or not $\ell$ defines a unique
simple curve. If it does, then a complete
traversal of it enables one to find all solutions.

On the other hand, if $\ell$ is multi-branched, the
application of the method may lead only to those
solutions that lie on the branch containing the
starting point. As will be seen later, multiple
branches do exist for some classes of functions.
Therefore, in order to find all solutions, a starting
point on each branch of $\ell$ must be initiated.

On a continuous, differentiable solution curve
$\ell$, the sign change of the method is indicated by
the fact that the directional derivative of $f_n(x)$
in the tangential direction of $\ell$ changes sign if and
only if the corresponding Jacobian of $f$ on $\ell$ changes
sign [4]. However, difficulty arises when $\ell$ does
intersect itself. Since the directional derivative
is not defined at the point of intersection, the
Jacobian can no longer be used for judging the monotonicity
of $f_n$ along $\ell$ at that point. The situation,
however, can be described by the following
theorem. 


where $f$ is a continuously differentiable function
from $R^n$ into itself. Several methods [1] - [5]
have been proposed for finding multiple solutions.
In [4], Chao, Liu and Pan developed a systematic
search method for solving multiple solutions by
numerical integration of a set of differential equations
of the form

$$(\partial f/\partial x)\,\dot{x} = (-f_1,\ -f_2,\ \dots,\ -f_{n-1},\ \pm f_n)^T$$

along the space curve, $\ell$, defined by the intersection
of the solution manifolds for

$$f_i(x) = 0, \quad i = 1, 2, \dots, n-1.$$
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Theorem  1: 


Let $f(x): R^n \to R^n$, $n \ge 2$, be a $C^1$ function. If

$$\nabla f_n = v \triangleq (\Delta_{n1},\ \Delta_{n2},\ \dots,\ \Delta_{nn})^T \quad (5)$$

where $\Delta_{ni}$ is the (ni)th cofactor of $\partial f/\partial x$, then

$$|J| = \|v\|^2 \ge 0. \quad (6)$$

Furthermore, if $f \in C^2$, then $f_n$ also satisfies
Laplace's equation


matrix  has  previously  been  discussed  by  Branin  [6], 
[7].  Branin  also  described  a  procedure,  called 
the  method  of  signatures,  for  computing  all  the 
roots  of  a  polynomial.  However,  the  algorithm 
fails  in  cases  where  singular  points  do  exist  on 
the  search  trajectory.  In  the  following  section, 
a  systematic  method  for  obtaining  all  the  solutions 
of  a  complex  polynomial  with  complex  coefficients 
is  presented. 

III.  ROOTS  OF  A  POLYNOMIAL 
The  Basic  Algorithm 


$$\frac{\partial^2 f_n}{\partial x_1^2} + \frac{\partial^2 f_n}{\partial x_2^2} + \cdots + \frac{\partial^2 f_n}{\partial x_n^2} = 0. \quad (7)$$

Proof:  Expansion  of  the  Jacobian  determinant  along 
the  nth  row  results  in 

$$\det J = \sum_{i=1}^{n} \frac{\partial f_n}{\partial x_i}\,\Delta_{ni} = \|v\|^2 \ge 0.$$

The proof of the second part, although complicated,
is quite straightforward and is therefore omitted.

On  a  continuous,  differentiable  solution 
curve,  the  directional  derivative  is  given  by  [4] 

$$\frac{\partial f_n}{\partial \ell} = |J| / \|v\|. \quad (8)$$

Thus, under the conditions of Theorem 1, if $\ell$ does
not intersect itself, then on a given branch of $\ell$,
the directional derivative would never change sign,
and this also implies that there exists at most one
solution on that branch. From this and Theorem 1,
the following theorem can be deduced.

Theorem  2: 

Let $f(x): R^n \to R^n$, $n \ge 2$, be a $C^1$ function such that
$f = 0$ has more than one solution. If $\ell$, defined
by

$$f_i(x) = 0, \quad i = 1, \dots, n-1,$$

is a unique, simple curve, then $\nabla f_n \ne v$.

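For $n = 2$, condition (5) reduces to

$$\frac{\partial f_2}{\partial x_1} = -\frac{\partial f_1}{\partial x_2}, \qquad \frac{\partial f_2}{\partial x_2} = \frac{\partial f_1}{\partial x_1}. \quad (9)$$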


In  view  of  the  foregoing  comments,  the  method 
for  obtaining  all  solutions  can  now  be  formulated 
for  the  nth  order  polynomial  equation 

$$g(z) = f_1(x_1, x_2) + j f_2(x_1, x_2) = 0. \quad (10)$$

Application of the method described in Section I to
$f = 0$ leads to

$$\frac{df_1}{dt} = -f_1, \quad f_1(x_{10}, x_{20}) = 0$$

$$\frac{df_2}{dt} = \pm f_2, \quad f_2(x_{10}, x_{20}) = f_{20} \quad (11)$$

where the initial conditions are such that the
starting point $x_0$ lies on or close to each
branch $\ell_i$ of $\ell$ defined by $f_1 = 0$. By using the
chain rule of differentiation and assuming
$\det J \ne 0$, (11) can be rewritten as


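$$\dot{x}_1 = \frac{1}{\det J}\left(-f_1 \frac{\partial f_2}{\partial x_2} \mp f_2 \frac{\partial f_1}{\partial x_2}\right)$$

$$\dot{x}_2 = \frac{1}{\det J}\left(f_1 \frac{\partial f_2}{\partial x_1} \pm f_2 \frac{\partial f_1}{\partial x_1}\right), \quad x_0 \in \ell_i, \quad i = 1, 2, \dots, n. \quad (12)$$

$$\frac{dg}{dz} = g'(z) = h_1(x_1, x_2) + j h_2(x_1, x_2). \quad (13)$$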


An expression in the complex domain is obtained as

$$\dot{z} = \frac{1}{h_1^2 + h_2^2}\left[(-f_1 h_1 \pm f_2 h_2) + j(f_1 h_2 \pm f_2 h_1)\right], \quad z_0 \in \ell_i, \quad i = 1, 2, \dots, n. \quad (14)$$


If $f_1$ and $f_2$ represent the real part and the
imaginary part of an analytic function, respectively,
then the condition $\nabla f_n = v$ in the two-dimensional
case is essentially equivalent to the Cauchy-Riemann
conditions. For polynomials of a complex
variable, the relationship between the Cauchy-Riemann
conditions and the determinant of the Jacobian


Further simplification by using the notion of complex
conjugate (*) leads to the following compact form

$$\dot{z} = -g(z)/g'(z), \quad \text{if the minus sign is chosen}$$

$$\dot{z} = -g^*(z)/g'(z), \quad \text{if the plus sign is chosen} \quad (15)$$

$$z_0 \in \ell_i, \quad i = 1, 2, \dots, n.$$

The $\pm$ in (14) or the complex conjugate sign in
(15) must be switched at the solution points and at
the even singular points to be defined later.

Even though the algorithm is derived from $f$, it
is seen from (15) that analytic expressions for
$f_1$, $f_2$, $h_1$ and $h_2$ are not required explicitly. The
algorithm is thus most suited for finding all the
roots of a complex polynomial.
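
As a minimal sketch of the minus-sign form of (15), the following Python fragment applies Euler's method to $\dot{z} = -g(z)/g'(z)$ from a starting point assumed to lie on $\mathrm{Re}\, g(z) = 0$. The helper names (`horner`, `derivative`, `trace_to_root`), the step size and the tolerance are illustrative assumptions rather than part of the original algorithm, and the sign switching at solution points and even singular points is deliberately left out.

```python
def horner(coeffs, z):
    """Evaluate a polynomial (coefficients highest power first) at z."""
    acc = 0j
    for c in coeffs:
        acc = acc * z + c
    return acc


def derivative(coeffs):
    """Coefficients of the derivative polynomial, highest power first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def trace_to_root(coeffs, z0, h=0.01, tol=1e-9, max_steps=100000):
    """Euler-integrate dz/dt = -g(z)/g'(z) until |g(z)| < tol."""
    dcoeffs = derivative(coeffs)
    z = complex(z0)
    for _ in range(max_steps):
        g = horner(coeffs, z)
        if abs(g) < tol:
            return z
        dg = horner(dcoeffs, z)
        if dg == 0:
            raise ZeroDivisionError("singular point: g'(z) = 0")
        z += h * (-g / dg)  # one Euler step of the minus-sign flow
    raise RuntimeError("no root reached within max_steps")


# Cubic z^3 + 3z^2 + 28z + 26, started on the line Re z = -1, which is
# one branch of Re g(z) = 0 (an illustrative choice of starting point).
root = trace_to_root([1, 3, 28, 26], -1 + 50j)
```

With the minus sign both $f_1$ and $f_2$ decay along the trajectory, so the sketch stops at the first root it meets; continuing past that root would require the sign change discussed in the text.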


Basis for the Sign Change

Due to the nature of isolated singularities,
the existence of (n-1) such points in the denominator
of (15) does not pose any problem to the root-finding.
The sign change at the solution points
has been discussed in [4] and it will not be repeated
here. The sign change at certain singular
points where the Jacobian vanishes can be judged
from the following theorem.

Theorem  3:  Suppose  g(z)  is  an  analytic  function 
such  that 


In the formulation of the present problem, it
is assumed that $f_1(x) = \mathrm{Re}\, g(z)$ and $f_2(x) = \mathrm{Im}\, g(z)$.
Thus the root-searching is along the trajectory $\ell$
defined by $f_1 = 0$. As such, the previous two
theorems are stated in a manner compatible with this
particular formulation. The same conclusion can
also be drawn when the trajectory of $f_2 = 0$
($\mathrm{Im}\, g(z) = 0$) is used for searching all the solutions
of a complex polynomial. A theorem similar to that
of Theorem 3 has been proved in [8] for plotting
the root locus of a feedback control system.


Points 


In  order  to  prevent  unnecessary  search  in  find¬ 
ing  multiple  solutions,  it  is  important  to  estimate 
the  upper  bound  inside  which  all  the  roots  of  a 
polynomial  will  lie.  Once  the  absolute  bound  is 
obtained,  the  search  can  then  be  confined  to  the 
circle  of  radius  M.  Several  methods  for  computing 
such bounds are available [9]. One such root
bound, derived from the Gershgorin circle, is given
in  the  following  theorem. 


$$g(w) = j\alpha, \quad \text{where } \alpha \text{ is real and } \alpha \ne 0$$

$$g^{(q)}(w) = 0, \quad q = 1, 2, \dots, m-1$$

$$g^{(m)}(w) \ne 0, \quad m \ge 2$$

at some point $w$ located on $\mathrm{Re}\, g(z) = 0$. Let
$R_g = \{z \mid \mathrm{Re}\, g(z) = 0\}$. Then in the neighborhood of $w$,
$R_g$ consists of $m$ branches $R_{g1}, R_{g2}, \dots, R_{gm}$ and
$R_{g1} \cap R_{g2} \cap \cdots \cap R_{gm} = \{w\}$. Furthermore, for each $i$,
$1 \le i \le m$, $\mathrm{Im}\, g(z)|_{R_{gi}}$ is either a local maximum
or a local minimum at $w$ if $m$ is even; it is either
an increasing or a decreasing function if $m$ is odd.

For convenience, a singular point in Theorem
3 is called an even singular point if $m$ is even;
otherwise it is said to be an odd singular point.
The  criterion  for  the  sign  change  at  singularities 
follows  directly  from  the  above  theorem  and  is  for¬ 
mally  stated  in  the  following  theorem. 

Theorem 4: If $g(z)$ is an nth-order complex polynomial
and $w$ is not a solution point of $g(z) = 0$,
then the directional derivative of $\mathrm{Im}\, g(z)$ in the
tangential direction of $\ell$ defined by $\mathrm{Re}\, g(z) = 0$
changes sign when passing through a point $w$ if and
only if $w$ is an even singular point.

In view of the previous theorem, it is clear
that the sign must be changed when passing through
an even singular point even though the Jacobian
does not change its sign. For an odd singular
point the sign must be kept unchanged when passing
through it since $f_2(x)$ does not change its monotonicity.
Due to the very nature of an odd singular
point, it is clear that a sign change at such
a singular point will cause the algorithm to oscillate.
Since there are only two possibilities,
higher order derivative tests can be avoided in the
actual implementation. One can always initiate a
sign change whenever a singular point is passed.
If oscillation results, one should proceed without
any sign change.


Theorem 5: If $x^*$ is a solution of $f(x) = 0$ where
$f = (f_1, f_2)^T$, $f_1 = \mathrm{Re}\, g(z)$, $f_2 = \mathrm{Im}\, g(z)$, then

$$\|x^*\| \le M = \max\{|a_n|,\ 1 + |a_k|,\ k = 1, 2, \dots, n-1\} \quad (16)$$

where the $a_k$'s are the coefficients of the corresponding
monic polynomial

$$g(z) = z^n + a_1 z^{n-1} + \cdots + a_{n-1} z + a_n. \quad (17)$$
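
The bound (16) is simple to evaluate. The following is a minimal sketch; the function name `root_bound` and the coefficient layout (a list `[1, a_1, ..., a_n]`, highest power first) are assumptions made for illustration.

```python
def root_bound(coeffs):
    """Bound (16) for a monic polynomial given as [1, a_1, ..., a_n]."""
    if coeffs[0] != 1:
        raise ValueError("polynomial must be monic")
    a = coeffs[1:]
    # max of |a_n| and 1 + |a_k| for k = 1, ..., n-1
    return max([abs(a[-1])] + [1 + abs(ak) for ak in a[:-1]])


# Cubic z^3 + 3z^2 + 28z + 26 from the numerical example
M = root_bound([1, 3, 28, 26])
```

For this cubic the three candidates are 26, 4 and 29, which agrees with the bound M = 29 quoted in the numerical example.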

As pointed out previously, the algorithm requires
n starting points located on or close to
each branch of $\ell$ defined by $f_1 = 0$. This can be
accomplished with the aid of Theorem 5 and the
properties of a polynomial for large z. It is
easily shown that for large z, or $|z| \gg M$, the
trajectories of $f_1 = 0$ approach straight lines with
constant phases

$$\theta_k = \frac{k\pi}{2n}, \quad k = 1, 3, 5, \dots, 4n-1 \quad (18)$$

and the asymptotes for the nth-order monic polynomial
(17) intersect at the centroid

$$z_c = -\frac{a_1}{n}. \quad (19)$$

The starting point can now be obtained from (18)
and (19) as

$$z_0^k = z_c + R e^{j\theta_k}, \quad k = 1, 3, 5, \dots, 4n-1 \quad (20)$$

for  an  arbitrary  R  >>  M. 
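
A minimal sketch of the starting-point construction, assuming the asymptote phases $k\pi/(2n)$ for odd $k$ and the centroid $-a_1/n$ of the monic polynomial; the function name `starting_points` and the coefficient layout are hypothetical.

```python
import cmath
import math


def starting_points(coeffs, R):
    """Candidate starting points on the asymptotes of Re g(z) = 0.

    coeffs = [1, a_1, ..., a_n], highest power first (monic polynomial).
    """
    n = len(coeffs) - 1
    zc = -coeffs[1] / n                    # centroid of the asymptotes
    points = []
    for k in range(1, 4 * n, 2):           # k = 1, 3, 5, ..., 4n-1
        theta = k * math.pi / (2 * n)      # asymptote phase
        points.append(zc + R * cmath.exp(1j * theta))
    return points


# Cubic from the numerical example: centroid -1, radius R = 50
points = starting_points([1, 3, 28, 26], R=50.0)
```

The list contains 2n points; as the text notes, only half of them are needed, since the others are the far ends of the same trajectories.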

Although  there  are  2n  points  in  (20),  only 
half  of  them  are  used.  The  other  half  are  just 
the  end  points  of  each  trajectory  and  they  can 
easily  be  identified  by  checking  their  phases.  The 
choice  of  R  depends  on  the  accuracy  required  for 
the  starting  point.  Since  the  root  bound  M  is 
known, R need not be much larger than M. A point on
$\ell$ can usually be obtained quite accurately within
a few steps from a rough estimate of (20) by using
the Newton iteration

$$z_{k+1} = z_k - \frac{\mathrm{Re}\, g(z_k)}{g'(z_k)} \quad (21)$$
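
The correction step can be sketched as follows, reading (21) as $z \leftarrow z - \mathrm{Re}\, g(z)/g'(z)$, which to first order removes the real part of $g$ while leaving the imaginary part unchanged. That reading, the function name `project_onto_curve` and the fixed iteration count are assumptions.

```python
import cmath
import math


def project_onto_curve(coeffs, z, iterations=5):
    """Move z onto Re g(z) = 0 by repeating z <- z - Re g(z) / g'(z)."""
    n = len(coeffs) - 1
    dcoeffs = [c * (n - i) for i, c in enumerate(coeffs[:-1])]

    def poly(cs, w):
        acc = 0j
        for c in cs:
            acc = acc * w + c
        return acc

    z = complex(z)
    for _ in range(iterations):
        z = z - poly(coeffs, z).real / poly(dcoeffs, z)
    return z, poly(coeffs, z).real   # corrected point and residual Re g


# Rough estimate on the asymptote with phase pi/6 (centroid -1, R = 50)
z_rough = -1 + 50 * cmath.exp(1j * math.pi / 6)
z_on, residual = project_onto_curve([1, 3, 28, 26], z_rough)
```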

IV. NUMERICAL EXAMPLE

In  this  section  an  example  is  presented  to 
illustrate  the  proposed  algorithm.  For  simplicity 
only Euler's method is used for illustration.

In  practice,  more  efficient  integration  techniques 
may  be  used  to  integrate  the  proposed  equations. 

Example: 

A  third  order  polynomial  equation 

$$g(z) = z^3 + 3z^2 + 28z + 26 = 0$$

is considered where all the singular points are
located on one branch of $\mathrm{Re}\, g(z) = 0$. The trajectories
of both $\mathrm{Re}\, g(z) = 0$ and $\mathrm{Im}\, g(z) = 0$ are
shown in the Figure. Trajectories of $f_1 = \mathrm{Re}\, g(z) = 0$
are used to search for the solutions with the
bound $M = 29$, the centroid $z_c = -1$ and $R = 50$.
Tracing along $\ell_1$ results in three solutions
$z_1 = -1 + j5$, $z_2 = -1$ and $z_3 = -1 - j5$, while the other
two branches, $\ell_2$ and $\ell_3$, contain no solution at
all.

V.  CONCLUSION 

A  systematic  search  method  has  been  developed 
for  computing  multiple  solutions  of  a  nonlinear 
equation  with  multi-branched  solution  curves.  In 
particular,  the  approach  has  been  applied  to  a 
class  of  functions  derived  from  both  the  real  part 
and  the  imaginary  part  of  a  complex  polynomial. 

It turns out that the algorithm, expressed in its
complex form, is most suited for finding all the
roots of a polynomial. Analytic expressions for
both  the  real  and  the  imaginary  parts  of  a  poly¬ 
nomial  are  not  required  explicitly.  The  key  to  the 
continuation  of  the  root-finding  procedure  at  the 
singularities  on  the  solution  curve  is  the  sign 
change  associated  with  the  numerical  algorithm.  It 
is shown that the sign must be changed whenever an
even singular point is encountered along the search
trajectory and no such change is allowed when passing
through an odd singular point. Although the
method is formulated in such a way as to find all
solutions for a class of functions from $R^2$ into
itself, it is conceivable that the approach may be
generalized to higher-dimensional functions satisfying
the conditions as described in Theorem 1, provided
that a starting point for each branch of the
solution curve can be determined.
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Abstract 

A  continuation  method  is  presented  for  finding  all  the  roots  of  a 
polynomial.  Each  root  is  obtained  systematically  by  numerical  inte¬ 
gration.  Selection  of  starting  points  and  the  existence  of  singular 
points  are  discussed.  Moreover,  transformations  may  be  applied  to 
reduce the computational effort.


1.  INTRODUCTION 

The  solution  for  the  roots  of  real  or  com¬ 
plex  polynomials  is  a  fundamental  require¬ 
ment  in  many  areas  of  applied  mathematics 
as  well  as  in  engineering.  This  is  par¬ 
ticularly  so  for  the  application  of  Lap¬ 
lace  transform  theory  to  linear  autonomous 
systems.  Although  it  is  well-known  that 
an nth-order polynomial has exactly n roots
with  multiplicity  counted,  the  evaluation 
of  all  the  roots  is  not  at  all  a  simple 
task.  The  difficulty  is  primarily  due  to 
its  nonlinear  nature. 

The  objective  of  this  paper  is  to  propose 
a  systematic  approach  for  finding  all  the 
roots  of  a  polynomial.  The  algorithm  is 
based  on  continuation  methods  [1]  -  [4]. 

The original polynomial is first embedded
into  a  new  equation  by  introducing  a  para¬ 
meter  r.  This  will  result  in  n-branch  root 
loci  as  r  varies  continuously.  The  root 
finding  procedure  then  becomes  a  matter  of 
tracing  along  these  loci  up  to  the  desired 
roots.

2.  THE  CONTINUATION  METHOD 
Consider  the  problem  of  finding  all  the 


roots of the polynomial equation

$$P(s) = s^n + a_1 s^{n-1} + a_2 s^{n-2} + \cdots + a_{n-1} s + a_n = 0 \quad (1)$$

where $s$ is a complex variable and the $a_i$'s are
complex coefficients. It is well known [1],
[2]  that  such  a  problem  can  be  solved  by 
using  continuation  methods.  For  example, 
the  roots  of  (1)  can  be  obtained  by  solv¬ 
ing  the  continuation  equation  [1] 

$$F(s, r) = (1 - r)Q(s) + rP(s) = 0 \quad (2)$$

where  r  is  a  positive  real  number  and  Q(s) 
is  any  nth-order  polynomial  whose  n  roots 
are known. It is seen that when $r = 1$, (2)
reduces to (1), while for $r = 0$, (2) becomes

$$F(s, 0) = Q(s) = 0 \quad (3)$$

whose  n  roots  are  already  known.  Thus  as  r 
varies  from  zero  to  one  continuously  the 
trajectories  of  these  roots  comprise  n 
branches  of  root  loci.  Each  locus  starts 
from  a  known  root  of  Q(s)  and  terminates  on 
a  desired  root  of  P(s).  Therefore  by  trac¬ 
ing  along  these  trajectories  all  roots  of 
(1)  can  be  located. 

The  advantage  of  this  type  of  formulation 


lies in the fact that one can easily obtain
the approximate roots provided there exists
no singular point on the root trajectory.

It can therefore be used as a means for obtaining
sufficiently close initial guesses
for other methods which have a rapid rate of
convergence, such as Newton's method. However,
for any arbitrary polynomial equation
$P(s) = 0$ and a given polynomial $Q(s)$, there
is  no  guarantee  that  the  root  loci  will  not 
intersect  each  other.  Unless  singular 
points  can  be  handled  properly,  one  would 
have to try a different $Q(s)$. In what
follows, an algorithm given in [4] for computing
the root-locus plot, and hence the solution
of (2), is described. The problem of
singularities  on  the  trajectory  will  also 
be  discussed. 

Consider the set of differential equations

$$\frac{d}{dt}F(s(t), r(t)) = -F(s(t), r(t)), \quad F(s(0), r(0)) = 0$$

$$\frac{d}{dt}r(t) = 1, \quad r(0) = 0 \quad (4)$$

where $t$ is a dummy variable. Application
of the chain rule to (4) yields

$$\frac{ds}{dt} = -\frac{F(s, r) + (\partial F/\partial r)}{\partial F/\partial s}, \quad s(0) = s_0$$

$$\frac{dr}{dt} = 1, \quad r(0) = 0 \quad (5)$$

where $s_0$ is a root of $Q(s)$. Equivalently,
(5) can be rewritten as

plot for $0 < t \le 1$ which contains n branches.
On  the  root  locus,  it  is  possible  that  the 
denominator  of  (7)  may  become  zero.  A 
point  s*  such  that 

$$D(s^*) \triangleq (1 - r)Q'(s^*) + rP'(s^*) = 0, \quad (8)$$


is called a singular point. From (8) it is
obvious that there can be at most (n-1)
isolated singular points located on the trajectories.
This corresponds to the situation
that two or more root loci intersect
at $s^*$. Singular points can be classified
according to their higher order derivatives.
A qth-order singular point is defined as
a singular point such that


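$$\left.\frac{\partial^k F}{\partial s^k}\right|_{s=s^*} = 0, \quad k = 1, 2, \dots, q-1$$

$$\left.\frac{\partial^q F}{\partial s^q}\right|_{s=s^*} \ne 0, \quad 2 \le q \le n. \quad (9)$$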


Depending on whether $q$ is odd or even, singular
points can further be classified as
odd or even singular points. It is shown
in [4] that whenever an odd singular point
is encountered on the trajectory, it is
necessary to jump over it by adding a small
variation $|\Delta s|$ along the tangential direction
of the locus; for an even singular
point, the direction of the locus at $s^*$
must be changed by a vector

$$\Delta z = \Delta s\, e^{-j\pi/q} \quad (10)$$


$$\frac{ds}{dt} = -\frac{-rQ(s) + (1 + r)P(s)}{(1 - r)Q'(s) + rP'(s)}, \quad s(0) = s_0$$

$$\frac{dr}{dt} = 1, \quad r(0) = 0 \quad (6)$$

or

$$\frac{ds}{dt} = -\frac{-tQ(s) + (1 + t)P(s)}{(1 - t)Q'(s) + tP'(s)}, \quad s(0) = s_0, \quad t \in (0, 1] \quad (7)$$


where $Q'$ and $P'$ denote the derivatives of
$Q(s)$ and $P(s)$ with respect to $s$, respectively.
Equation (7) can now be integrated
by using any numerical technique until $t = 1$
is reached. This will result in a root-locus


where $\Delta s$ is a sufficiently small variation
in  the  tangential  direction  of  the  locus 
when  approaching  s*. 

3. STARTING ROOTS

The  proposed  method  requires  n  starting 
roots of $Q(s) = 0$. Theoretically $Q(s)$ can
be any nth-order polynomial so long as its
roots are known. For practical purposes,
$Q(s)$ should be as simple as possible. The
choice of

$$Q(s) = s^n - M^n \quad (11)$$

is made where $M$ is assumed to be positive
and real. The $n$ corresponding roots are

$$s_k = M \exp(j2\pi k/n), \quad k = 0, 1, 2, \dots, n-1. \quad (12)$$
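
The pieces above can be combined into a short sketch: the embedding polynomial $Q(s) = s^n - M^n$, its roots as starting values, and a fourth-order Runge-Kutta integration of $ds/dt = -[-tQ(s) + (1+t)P(s)] / [(1-t)Q'(s) + tP'(s)]$ from $t = 0$ to $t = 1$, which is the form obtained by applying the chain rule to $dF/dt = -F$ with $r = t$. The function names, the coefficient layout (highest power first, monic) and the default step count are illustrative assumptions, and no handling of singular points is attempted.

```python
import cmath


def polyval(coeffs, s):
    """Horner evaluation, coefficients highest power first."""
    acc = 0j
    for c in coeffs:
        acc = acc * s + c
    return acc


def polyder(coeffs):
    """Coefficients of the derivative polynomial."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def continuation_roots(p, M, steps=200):
    """Trace each root of Q(s) = s^n - M^n to a root of the monic p."""
    n = len(p) - 1
    q = [1] + [0] * (n - 1) + [-(M ** n)]
    dp, dq = polyder(p), polyder(q)

    def rhs(t, s):
        num = -t * polyval(q, s) + (1 + t) * polyval(p, s)
        den = (1 - t) * polyval(dq, s) + t * polyval(dp, s)
        return -num / den

    h = 1.0 / steps
    roots = []
    for k in range(n):
        s = M * cmath.exp(2j * cmath.pi * k / n)   # k-th root of Q(s)
        for i in range(steps):                     # classical RK4 steps
            t = i * h
            k1 = rhs(t, s)
            k2 = rhs(t + h / 2, s + h / 2 * k1)
            k3 = rhs(t + h / 2, s + h / 2 * k2)
            k4 = rhs(t + h, s + h * k3)
            s += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        roots.append(s)
    return roots
```

A call such as `continuation_roots([1, 0, 0, -8], M=5.0)` traces the three cube roots of 125 to the three cube roots of 8. For polynomials whose loci pass near a singular point, the step size would have to be reduced or the special treatment of singular points described above added.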

In  order  to  reduce  unnecessary  computational 
effort  M  should  be  chosen  properly.  Hence 
it  is  important  to  estimate  the  bound  in¬ 
side  which  all  the  roots  may  lie.  One  such 
bound  is  given  in  the  following  theorem  [5]. 
Theorem. Let

$$h(s) = s^n + a_1 s^{n-1} + \cdots + a_{n-1} s + a_n$$

be a monic polynomial. If $s^*$ is a root of
$h(s) = 0$, then $|s^*| \le N$ where

$$N = \max\{|a_n|,\ 1 + |a_j|,\ j = 1, 2, \dots, n-1\}.$$

From  the  above  theorem,  it  is  perhaps  logi¬ 
cal  that  M  should  not  be  chosen  to  be 
greater  than  N.  In  view  of  the  fact  that 
N  is  usually  much  larger  than  the  least 
upper  bound  for  the  roots,  two  transforma¬ 
tions  given  in  [6]  may  be  used  to  modify 
(2). Using the transformation

$$y = s + a_1/n \quad (13)$$

equation (2) reduces to

$$y^n + b_2 y^{n-2} + \cdots + b_n = 0. \quad (14)$$

A second transformation

$$y = \sqrt[n]{|b_n|}\, z \quad (15)$$

is  then  used  to  convert  (14)  into  a  poly¬ 
nomial  equation  in  z.  The  roots  in  the 
transformed  z-plane  are  more  uniformly  dis¬ 
tributed,  and  as  a  result,  the  computa¬ 
tional  effort  may  be  reduced  considerably. 
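
A minimal sketch of the two transformations, assuming the shift $y = s + a_1/n$ and the scaling $y = |b_n|^{1/n} z$; this reading reproduces the normalized cubic quoted in the example of Section 4. The function name `shift_and_scale` and the returned tuple are hypothetical.

```python
def shift_and_scale(p):
    """Normalize a monic polynomial p = [1, a_1, ..., a_n].

    Returns (coefficients in z, shift, rho), where s = rho * z + shift.
    """
    n = len(p) - 1
    shift = -p[1] / n                      # s = y + shift, i.e. y = s + a_1/n
    c = [complex(x) for x in p]
    # Taylor shift by repeated synthetic division: c becomes P(y + shift)
    for i in range(n):
        for k in range(1, n + 1 - i):
            c[k] += shift * c[k - 1]
    rho = abs(c[n]) ** (1.0 / n)           # requires a nonzero constant term
    scaled = [c[k] / rho ** k for k in range(n + 1)]
    return scaled, shift, rho


# Cubic from the example of Section 4
scaled, shift, rho = shift_and_scale([1, 1 - 3j, 23 + 32j, -37 - 185j])
```

A root $z$ found in the normalized plane maps back to the original variable as `rho * z + shift`.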

4. EXAMPLE

As  an  illustration  of  the  approach  pre¬ 
sented,  consider  the  polynomial  equation 

$$P(s) = s^3 + (1 - j3)s^2 + (23 + j32)s + (-37 - j185)$$

which has three known roots at $s_1 = 3 + j2$, $s_2 = -5 + j7$
and $s_3 = 1 - j6$. The root bound is $N = 188.7$.
Application of (7) and (12) using $M = 5$ along
with a fourth order Runge-Kutta method with
a step size of 0.005 results in three roots

$$s_1^* = 2.997 + j2.001$$
$$s_2^* = -5.001 + j6.998$$
$$s_3^* = 1.005 - j6.000.$$

From  this  example  it  is  obvious  that  N  is 
much  larger  than  the  least  upper  bound  for 
the  roots.  After  using  transformations 
(13)  and  (15) ,  the  normalized  equation  be¬ 
comes 

$$z^3 + (0.780729 + j1.03421)z - (0.4169435 + j0.908944) = 0.$$

It is seen that the new root bound is $N' = 2.296$.
Application of the proposed method yields
three roots in the z-plane

$$z_1 = 0.58136 + j0.17441$$
$$z_2 = -0.8139 + j1.04643$$
$$z_3 = 0.23255 - j1.22085$$

where the same integration technique with
a step size of 0.02 is used. Transforming
back to the s-plane gives

$$s_1 = 2.99999 + j2.00000$$
$$s_2 = -5 + j6.99994$$
$$s_3 = 1.00004 - j5.99996.$$

The root loci in both the s-plane and the z-plane
are shown in Figs. 1 and 2, respectively.
tively. 

5.  CONCLUSION 

A continuation method is presented for finding
all the roots of a polynomial. Singular
points along the trajectory can be properly
handled so that the root-finding procedure
can be continued for any given imbedding
polynomial.

A continuation algorithm for tracking the eigenvalues of a large sparse
system  of  equations  is  presented.  This  is  achieved  by  constructing  a  family 
of  similarity  transformations  which  triangularize  the  given  system  as  a  function 
of  the  underlying  parameter.  Since  both  the  resultant  triangular  matrix  and  the 
similarity  transformations  themselves  retain  the  sparseness  of  the  given  system 
of  equations,  the  resultant  algorithm  proves  to  be  quite  efficient  when  applied 
to  our  large  scale  system  problems. 
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The  goal  of  the  work  unit  is  the  application  of  the  techniques  of  the 
theory  of  functions  in  several  complex  variables  to  several  problems  in 
circuit  and  system  theory  which  are  modeled  by  rational  functions  in  two  or 
more  complex  variables.  Possibly  the  most  important  of  these  is  the 
analysis  and  design  problem  for  multidimensional  digital  filters  in  which  a 
multidimensional  z-transform  is  employed.  The  investigation,  however,  also 
includes  a  study  of  the  stability  problem  for  mixed  lumped/transmission  line 
systems  and  a  study  of  the  multivariable  passive  synthesis  problem. 

Our  major  activity  during  the  past  year  has  been  an  investigation  of  the 
design  problem  for  two-dimensional  digital  image  processing  filters.  Since 
the  filter  design  problem  has  historically  been  inextricably  intermingled  with 
the  spectral  factorization  problem,  this  study  began  with  an  investigation  of 
the  fundamental  limitations  on  the  existence  of  spectral  factors  for  two- 
dimensional  transfer  functions.  Since  the  resulting  conditions  for  the  exist¬ 
ence  of  a  quarter-plane  stable  spectral  factorization  proved  to  be  extremely 
stringent,  we  turned  our  attention  to  the  design  of  half-plane  stable  digital 
filters.  These  filters  have  far  less  stringent  existence  conditions  and  we 
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are  presently  developing  a  general  purpose  design  procedure  for  such  filters. 

In  an  effort  to  alleviate  the  need  for  artificially  imposing  any 
"causality"  structure  on  the  image  processing  problem,  we  have  also  initiated 
a  study  of  the  class  of  periodically  varying  discrete-time  systems  which 
naturally  model  the  actual  scanning  process  used  in  a  "real  world"  image 
processing  system.  Although  these  systems  are  time-varying,  they  represent 
the  only  known  class  of  time-varying  systems  which  admit  a  "viable"  frequency 
domain  theory.  As  such,  we  believe  that  it  will  be  possible  to  formulate  a 
viable  frequency  domain  theory  for  two-dimensional  image  processing  filters 
in  terms  of  the  physical  scanning  process  actually  employed,  thereby  alleviat¬ 
ing  many  of  the  difficulties  hitherto  encountered  in  two-dimensional  filtering 
theory,  which  are  actually  due  to  the  artificiality  of  the  model  rather  than 
the  physical  problem. 
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varying  discrete  time-systems  and  some  of  its  generalizations. 

7.  Publications  and  Activities: 

A.  Refereed  Journal  Articles 

1.  Murray,  J.,  "Spectral  Factorization  and  Quarter-Plane  Digital 
Filters", IEEE Trans. on Circuits and Systems, Vol. CAS-25,
pp.  586-592,  (1978). 

B.  Conference  Papers  and  Abstracts 

1.  Murray,  J.,  "Semidirect  Products  and  the  Stability  of  Time- 
Varying  Systems",  Proc.  of  the  Inter.  Symp.  on  the  Mathematics 
of  Networks  and  Systems,  Vol.  3,  T.H.  Delft,  Delft,  July  1979, 
pp.  121-125. 




2.  Saeks,  R.,  and  J.  Murray,  "Stability  and  Homotopy  II",  Proc. 
of  the  1979  Joint  Automatic  Control  Conf.,  Denver,  June  1979, 
p.  358,  (abstract  only). 

C.  Conferences  and  Symposia 

1.  Murray,  J.,  12th  Asilomar  Conf.  on  Circuits,  Systems,  and 
Computers,  Pacific  Grove,  Ca.,  Nov.  1978. 

2.  Murray,  J.,  1979  Inter.  Symp.  on  the  Mathematics  of  Networks 
and  Systems,  T.H.  Delft,  Delft,  July  1979. 

3.  Murray,  J.,  1979  Joint  Automatic  Control  Conf.,  Denver,  June 
1979. 




8. Reprint of "Spectral Factorization and Quarter-Plane Digital Filters" by
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Spectral  Factorization  and  Quarter-Plane 

Digital  Filters 

JOHN  J.  MURRAY 


Abstract - Two sets of necessary conditions are derived for the existence
of a rational spectral factorization of a given rational function of
two complex variables; partial converses of these results are given, and the
implications of these conditions for the design of minimum-phase FIR
filters and stable IIR filters are discussed. In particular, it is shown that
these conditions are closely related to the difficulties encountered in the
stabilization problem for two-dimensional IIR filters.

Introduction 

THE SUBJECT of two-dimensional digital filters has
received considerable attention of late; in particular,
two-dimensional spectral factorization has been treated in
a number of papers, and it is considered in great detail in [1].
The major problem which arises is that, in general, the
spectral factors of a rational transfer function are not
rational; some further processing, such as truncation and
smoothing, is usually employed to yield approximate rational
factors. It is, therefore, somewhat surprising that
the class of rational functions for which a rational spectral
factorization exists does not seem to have been investigated.
In this paper, we give two sets of conditions which
must be satisfied by such functions (Theorems 1 and 3); a
converse is given which may be applied to the numerator
and denominator polynomials separately. Now, the polynomial
spectral factors (when they exist) of a given polynomial
are minimum- and maximum-phase polynomials;
conversely, every such polynomial gives rise to trivial
spectral factors. Motivated by this, we apply the results of
Theorems 1 and 3 to the particular case of minimum-phase
polynomials (i.e., polynomials without zeros in the
unit polydisc).

In  this  context,  the  main  consequences  of  the  results  of 
this  paper  may  be  broadly  outlined  as  follows: 

i) A given polynomial has exactly the same amplitude
response as a minimum-phase polynomial if and only if
the classical one-variable method (of factoring the original
polynomial into a product of two polynomials devoid of
zeros in certain regions) can be applied. (This result is in
fact implicit in [1], but does not appear to have been
explicitly stated in the literature.) The corresponding
statement for minimum-phase stable rational functions is
false, however.
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ii)  If  the  conditions  given  in  Theorems  1  and/or  3  are 
not satisfied, then not only is there no minimum-phase,
stable rational function having exactly the same amplitude
response as the original, but the original amplitude response
cannot even be approximated arbitrarily well by
minimum-phase stable rational functions. This follows
from the fact that the conditions in Theorems 1 and 3 are
conditions on the amplitude response which are preserved
under any reasonable kind of convergence.

iii)  The  conditions  in  Theorem  3  are  easily  visualized 
and  surprisingly  stringent:  they  require  essentially  that  the 
gain  of  the  filter,  averaged  over  certain  directions  in  the 
frequency  plane,  have  no  variation  in  a  perpendicular 
direction. (See the discussion following Theorem 3.) This
gives extremely severe restrictions on the amplitude response
of minimum-phase FIR filters, minimum-phase
stable IIR filters, and the denominator polynomial of
arbitrary stable IIR filters.

iv) It has been pointed out by Bose [9] and Woods [10],
and again is implicit in [1], that there exist purely recursive
filters whose amplitude responses are not realizable as
the amplitude response of any stable purely recursive
filter, and that consequently any stabilization method
which  attempts  to  match  the  amplitude  response  of  the 
original  filter  is  doomed  to  failure.  The  restrictions  re¬ 
ferred  to  in  iii)  reinforce  this  conclusion  and  identify  the 
precise  properties  of  the  examples  in  [9]  and  [10]  which 
make  stabilization  impossible. 

Definitions and Notation

Our notation will follow that used in [2]; we repeat it
here for convenience. For simplicity we restrict ourselves
throughout to two dimensions, although there does not
appear to be any difficulty in extending the results to
higher dimensions. Thus all functions are assumed
throughout to be rational functions of two complex variables
unless otherwise stated; we further exclude the zero
function. Two-dimensional complex space will be denoted
by $C^2$, i.e., $C^2 = \{(Z_1, Z_2) \mid Z_1$ and $Z_2$ are complex
numbers$\}$. The open unit polydisc will be denoted by $U^2$, i.e.,

$$U^2 = \{(Z_1, Z_2) \in C^2 \mid |Z_1| < 1 \text{ and } |Z_2| < 1\}$$

and its closure will be denoted by $\bar{U}^2$:

$$\bar{U}^2 = \{(Z_1, Z_2) \in C^2 \mid |Z_1| \le 1 \text{ and } |Z_2| \le 1\}.$$

The distinguished boundary of the unit polydisc will be
denoted by $T^2$:

$$T^2 = \{(Z_1, Z_2) \in C^2 \mid |Z_1| = 1 \text{ and } |Z_2| = 1\}.$$

The frequency response of the filter whose transfer
function is $f(Z_1, Z_2)$ is simply the restriction of $f$ to $T^2$. We
will find it convenient to denote this restriction by $f^*$.

The one-dimensional sets corresponding to the above
are

$$U = \{Z \in C \mid |Z| < 1\}$$

$$\bar{U} = \{Z \in C \mid |Z| \le 1\}$$

$$T = \{Z \in C \mid |Z| = 1\}.$$

We need one further subset of $C^2$:

$$V^2 = \{(Z_1, Z_2) \in C^2 \mid |Z_1| > 1 \text{ and } |Z_2| > 1\}.$$

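By the Fourier coefficients of a function $h$ defined
on $T^2$ we mean the numbers

$$a_{mn} = \frac{1}{4\pi^2} \int_0^{2\pi}\!\!\int_0^{2\pi} \exp(-jm\theta_1 - jn\theta_2)\, h(e^{j\theta_1}, e^{j\theta_2})\, d\theta_1\, d\theta_2.$$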

Finally, let us state precisely what we mean by the term
spectral factorization. Several different forms of spectral
factorization are treated in [1]; here we will be concerned
only with the simplest form: if $f$ is a rational function, it
will be said to have a (rational, quarter-plane) spectral
factorization if $f = f_1 f_2$, where $f_1$ and $f_2$ are rational
functions, $f_1$ has no poles or zeros in $U^2$, and $f_2$ has no poles or
zeros in $V^2$. Several comments are in order concerning
this definition:

i) By "rational" we mean only "finite-order," i.e., the
functions are assumed to be expressible as the quotient of
two (finite-order) polynomials.

ii) The quarter-plane property enters only in connection
with the regions in which the factors are assumed to
be zero- and pole-free: in particular, if $f$ has no poles or
indeterminacies on $T^2$, and has a quarter-plane spectral
factorization, then there is a quarter-plane causal, stable
filter whose amplitude response is equal to $|f^*|$.

iii) It would possibly be more natural to work with $\bar{U}^2$
and $\bar{V}^2$ rather than $U^2$ and $V^2$ (especially when considering
stability). However, to do so would complicate the
statements of the theorems considerably, and it is usually
clear whether or not the results will hold with $\bar{U}^2$ and $\bar{V}^2$
in place of $U^2$ and $V^2$. (One needs only to check for zeros
and poles on $T^2$.) In general, if the "closed" version is not
obvious, it is not true: $1 - Z_1 Z_2$ will serve as a counterexample
in all such cases.

iv) To simplify the statements of the theorems, the
definition has been given in terms of the rational function
$f$ itself, rather than the spectral function $|f^*|^2$; however,
the conditions given in the theorems actually involve only
$|f^*|^2$.

v) We note that $V^2$ is defined to be a subset of $C^2$; thus
the behavior of functions at infinity is irrelevant to our
purposes.


Spectral  Factorization 

Our  first  criterion  for  the  existence  of  rational  spectral 
factors  is  very  much  in  the  spirit  in  which  spectral  factori¬ 
zation  is  treated  in  [1];  it  is  a  trivial  consequence  of 
Theorem  5.4.7  in  [2]. 

Theorem 1

If a rational function $f$ on $C^2$ has a rational spectral
factorization, then the Fourier coefficients $a_{mn}$ of $\log |f^*|$
are zero for all pairs of integers $(m, n)$ such that $m \ne 0$,
$n \ne 0$, and $m$ and $n$ have different signs, that is, for all
integer points in the second and fourth quadrants. The
converse is true for polynomial $f$.
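
As a rough numerical illustration of this criterion, the sketch below approximates a single Fourier coefficient of the log-amplitude on an N x N grid of the torus. The function names, the grid size, and the test polynomial $Z_1 - 2Z_2$ are illustrative assumptions and do not come from the paper; the factored polynomial is the one given in the Examples section. For $Z_1 - 2Z_2$ the coefficient $a_{1,-1}$ works out to $-1/4$ from the series for $\log|2 - w|$, so by Theorem 1 that polynomial has no rational spectral factorization.

```python
import cmath
import math


def fourier_coeff_log_amp(f, m, n, N=64):
    """Approximate a_mn of log|f*| by an N x N rectangle rule on the torus."""
    total = 0j
    for k1 in range(N):
        t1 = 2 * math.pi * k1 / N
        z1 = cmath.exp(1j * t1)
        for k2 in range(N):
            t2 = 2 * math.pi * k2 / N
            z2 = cmath.exp(1j * t2)
            kernel = cmath.exp(-1j * (m * t1 + n * t2))
            total += kernel * math.log(abs(f(z1, z2)))
    return total / (N * N)


def with_factors(z1, z2):
    # Product of a factor without zeros in U^2 and one without zeros in V^2
    # (the factored example polynomial from the Examples section).
    return (1 + 0.25 * z1 + 0.25 * z2 + 0.5 * z1 * z2) * (1 + 2 * z1 + 2 * z2 - 8 * z1 * z2)


def without_factors(z1, z2):
    # Hypothetical test polynomial (an assumption, not from the paper).
    return z1 - 2 * z2


if __name__ == "__main__":
    print(abs(fourier_coeff_log_amp(with_factors, 1, -1)))   # fourth-quadrant coefficient, expected near 0
    print(fourier_coeff_log_amp(without_factors, 1, -1))     # expected near -0.25
```

Only one coefficient is evaluated per call; a full test of the criterion would have to examine every $(m, n)$ with $mn < 0$, which this sketch does not attempt.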

As  mentioned  above,  this  criterion  involves  only  the 
absolute  value  of  /:  it  follows  that  the  existence  of  spectral 
factors  imposes  restrictions  on  the  amplitude  response  of 
a  two-dimensional  filter — in  contrast  with  the  situation  in 
one  dimension.  The  above  criterion,  however,  does  not 
present  these  restrictions  in  an  easily  visualized  form.  For 
instance,  it  is  difficult  to  gauge  exactly  how  severe  the 
restrictions  are.  For  this  reason,  we  next  present  condi¬ 
tions  which  are  stated  in  terms  of  the  log-amplitude  re¬ 
sponse  itself,  rather  than  its  Fourier  coefficients.  This 
result  takes  an  approach  which  seems  to  differ  substan¬ 
tially  from  those  previously  known:  it  gives  easily  visua¬ 
lized  necessary  conditions  on  those  rational  functions 
which  admit  a  rational  spectral  factorization.  Before  we 
state  this  theorem,  however,  we  first  present  a  simple 
result  which  will  be  used  in  the  proof,  and  is  also  of 
separate  interest:  one  of  its  consequences  is  that  when 
rational  spectral  factors  exist,  the  usual  one-dimensional 
stabilization  method  (for  unstable  denominator  polynomi¬ 
als)  can  be  used. 

Theorem 2

If the rational function $f$ admits a rational spectral
factorization, then there is a rational function $\tilde{f}$ (with
$\deg \tilde{f} \le \deg f$) such that

$$|\tilde{f}^*| = |f^*|$$

and $\tilde{f}$ has no poles or zeros in $U^2$.

Again, the converse holds for polynomial $f$.

Thus if the denominator polynomial of an unstable
filter has polynomial spectral factors, there is a stable
filter of at most the same order with the same amplitude
response (provided the polynomial has no zeros on $T^2$).

Again, most of the proof is contained in [2]; we fill in
the details here. Suppose $f$ has rational spectral factors;
then $f = f_1 P/Q$, where $f_1$ has no poles or zeros in $U^2$, and
$P$ and $Q$ are polynomials without zeros in $V^2$. Let

$$\tilde{P}(Z_1, Z_2) = Z_1^m Z_2^n \bar{P}(1/Z_1, 1/Z_2), \qquad Z_1 \ne 0,\ Z_2 \ne 0$$

where $m$ is the degree of $P$ in $Z_1$, $n$ is the degree of $P$ in
$Z_2$, and $\bar{P}$ is the polynomial whose coefficients are the
complex conjugates of the coefficients of $P$. Clearly $\tilde{P}$ is a
polynomial of degree less than or equal to the degree of $P$,
and so is also defined for $Z_1 = 0$ and $Z_2 = 0$. Now if


IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, VOL. CAS-25, NO. 8, AUGUST 1978


$\tilde{P}(Z_1, Z_2) = 0$, for $Z_1 \ne 0$ and $Z_2 \ne 0$, then $\bar{P}(1/Z_1, 1/Z_2) = 0$;
this implies that either

$$|1/Z_1| \le 1 \quad \text{or} \quad |1/Z_2| \le 1 \quad (\text{since } P \text{ has no zeros in } V^2)$$

and so either

$$|Z_1| \ge 1 \quad \text{or} \quad |Z_2| \ge 1$$

i.e.,

$$(Z_1, Z_2) \notin U^2.$$

Thus the only possible zeros of $\tilde{P}$ in $U^2$ are for $Z_1 = 0$ or
$Z_2 = 0$. But by standard results in the theory of several
complex variables [8], if the zero-set were nonempty, this
would imply that either $Z_1$ or $Z_2$ was a factor of $\tilde{P}$, which
is impossible by our choice of $m$ and $n$. Thus $\tilde{P}$ has no
zeros in $U^2$. Finally, on $T^2$

$$|\tilde{P}(Z_1, Z_2)| = |Z_1^m Z_2^n \bar{P}(1/Z_1, 1/Z_2)| = |\bar{P}(\bar{Z}_1, \bar{Z}_2)| = |P(Z_1, Z_2)|.$$

$\tilde{Q}$ is defined similarly and has similar properties. Then

$$\tilde{f} = f_1 \tilde{P}/\tilde{Q}$$

clearly has the required properties.

Conversely, suppose $f$ is any polynomial for which there
is a rational function $\tilde{f}$ without poles or zeros in $U^2$ such
that

$$|\tilde{f}^*| = |f^*|;$$

then $f/\tilde{f}$ is rational and analytic in $U^2$, and

$$|(f/\tilde{f})^*| = 1.$$

Thus by Theorems 5.2.5 and 5.2.6 in [2], $f/\tilde{f} = P/Q$,
where $P$ and $Q$ are polynomials, $P$ has no zeros in $V^2$,
and $Q$ has no zeros in $U^2$. Then

$$f = P\tilde{f}/Q$$

gives a rational (in fact, polynomial) spectral factorization
of $f$.

The  Second  Criterion 

Our  second  set  of  conditions  for  the  existence  of  a 
rational  spectral  factorization  is  given  in  the  following. 

Theorem  3 

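If a rational function $f$ on $C^2$ admits a rational spectral
factorization, then

$$\frac{1}{2\pi} \int_0^{2\pi} \log |f(e^{jm\theta}, e^{j(n\theta + \psi)})|\, d\theta$$

is a constant independent of $\psi$ ($0 \le \psi < 2\pi$), for all integers
$m > 0$ and $n > 0$.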

Again, these conditions depend only on the amplitude
response of $f$. The simplest condition is that for $m = 1$ and
$n = 1$: it can be easily visualized by drawing two adjacent
squares in the $\theta_1\theta_2$ plane on which the amplitude response
is defined (the frequency response extends to the entire
$\theta_1\theta_2$ plane by periodicity), and drawing lines $L_\psi$ with slope
1 and length $2\pi\sqrt{2}$ on these squares (see Fig. 1).


Fig. 1.


Then the condition for $m = 1$, $n = 1$ can be restated as:
the "average" amplitude of the function $f$ along the line
$L_\psi$ is a constant, that is, it is independent of the particular
line $L_\psi$ chosen. ("Average" here is to be understood as
the geometric mean of the amplitude, or the arithmetic
mean of the log-amplitude.) Alternatively, we may say
that the average level of the amplitude over any line of
slope 1 and of length $2\pi\sqrt{2}$ is independent of the position
of the line in the $\theta_1\theta_2$ plane. (For example, we could
vary the $L_\psi$ over the dotted square in the direction n.) The
conditions for higher $m$ and $n$ have a similar interpretation,
with a slope of $n/m$ instead of 1, and length
$2\pi\sqrt{m^2 + n^2}$ instead of $2\pi\sqrt{2}$; clearly, if $m$ and $n$ are
not relatively prime, the corresponding condition is superfluous.

This theorem then gives a striking limitation on the
amplitude response of a rational function which admits a
rational spectral factorization: even the simplest of the
conditions (that for $m = n = 1$) implies that such a function
cannot accurately approximate an amplitude which has
large variations in overall level in the direction n shown
in Fig. 1.

Proof: In view of Theorem 2, it suffices to prove this
under the assumption that $f$ has no poles or zeros in $U^2$.
This assumption implies that $f$ has a holomorphic logarithm
in $U^2$. Then, for any integers $m > 0$, $n > 0$, and any
real number $\psi$,

$$\log f(Z^m, Z^n e^{j\psi})$$

is a holomorphic function of one complex variable for
$Z \in U$. Thus

$$\mathrm{Re}(\log f(Z^m, Z^n e^{j\psi}))$$

is a harmonic function in $U$, and so by the mean-value
property of harmonic functions

$$\frac{1}{2\pi} \int_0^{2\pi} \mathrm{Re}(\log f(Z^m, Z^n e^{j\psi}))\, d\theta = \mathrm{Re}(\log f(0^m, 0^n e^{j\psi})), \qquad Z = e^{j\theta}$$

i.e.,

$$\frac{1}{2\pi} \int_0^{2\pi} \mathrm{Re}(\log f(e^{jm\theta}, e^{j(n\theta + \psi)}))\, d\theta = \mathrm{Re}(\log f(0, 0));$$

but

$$\mathrm{Re} \log w = \log |w|, \qquad \text{for } w \ne 0$$

and so

$$\frac{1}{2\pi} \int_0^{2\pi} \log |f(e^{jm\theta}, e^{j(n\theta + \psi)})|\, d\theta = \log |f(0, 0)|$$

and the right-hand side is independent of $\psi$ (and, incidentally,
of $m$ and $n$ also).
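
The averaging condition lends itself to a quick numerical check. The following sketch approximates the directional average of the log-amplitude by a uniform sum over $\theta$. Both test polynomials are hypothetical choices made for illustration (they do not appear in the paper): the first has no zeros in the closed unit polydisc, so every average should equal $\log|f(0,0)| = 0$; for the second, $Z_1 - 2Z_2$, the $m = n = 1$ average equals $\log|1 - 2e^{j\psi}|$, which visibly depends on $\psi$.

```python
import cmath
import math


def directional_log_mean(f, m, n, psi, N=2048):
    """Approximate (1/2pi) * integral of log|f(e^{j m t}, e^{j(n t + psi)})| dt."""
    total = 0.0
    for k in range(N):
        theta = 2 * math.pi * k / N
        z1 = cmath.exp(1j * m * theta)
        z2 = cmath.exp(1j * (n * theta + psi))
        total += math.log(abs(f(z1, z2)))
    return total / N


def zero_free(z1, z2):
    # Hypothetical polynomial: |0.2 z1 + 0.3 z2 + 0.1 z1 z2| <= 0.6 < 1 on the
    # closed unit polydisc, so it has no zeros there and f(0, 0) = 1.
    return 1 + 0.2 * z1 + 0.3 * z2 + 0.1 * z1 * z2


def level_varies(z1, z2):
    # Hypothetical polynomial whose diagonal average depends on psi.
    return z1 - 2 * z2


if __name__ == "__main__":
    for (m, n, psi) in [(1, 1, 0.0), (2, 1, 1.0), (1, 3, 2.5)]:
        print(m, n, psi, directional_log_mean(zero_free, m, n, psi))
    print(directional_log_mean(level_varies, 1, 1, 0.0))
    print(directional_log_mean(level_varies, 1, 1, math.pi))
```

A uniform sum is used because the integrand is smooth and periodic when $f$ has no zeros on the torus; the grid size is an arbitrary choice.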

An obvious question which arises is the extent to which
the converses of these results hold. In fact, the converse of
Theorem 3 holds for polynomials, and modified converses
of both Theorems 1 and 3 hold even for rational functions.
The modification takes the following form: if the
Fourier coefficients of $\log |f^*|$ (where $f$ is a rational
function) vanish for $mn < 0$, then there is a rational function
$\tilde{f}$ with rational spectral factors (equivalently, a rational
function without poles or zeros in $U^2$), such that
$|\tilde{f}^*| = |f^*|$. (A similar statement holds for Theorem 3.)
However, the proofs of these converses involve some
technical analytic details, and so are given in the Appendix.

The modification in the above converses lies, of course,
in the fact that we cannot conclude that $f$ itself has
rational spectral factors: thus there are some rational
functions which can be stabilized without changing the
amplitude response but to which the classical 1-variable
factorization technique cannot be applied. A simple example
of this is the function

$$f(Z_1, Z_2) = \frac{Z_1 + Z_2 - 1}{Z_1 + Z_2 - Z_1 Z_2}.$$

Here, $|f^*|$ is identically 1, and so has trivial spectral
factors; but $f$ itself clearly does not.
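
The claim that $|f^*|$ is identically 1 follows from the identity $\overline{Z_1 + Z_2 - 1} = (Z_1 + Z_2 - Z_1 Z_2)/(Z_1 Z_2)$ on $T^2$, and it can also be spot-checked numerically. The sketch below is only such a spot check on a 16 x 16 grid; the grid size is an assumption, chosen so that the grid avoids the two points $(e^{\pm j\pi/3}, e^{\mp j\pi/3})$ where numerator and denominator vanish together.

```python
import cmath
import math


def f(z1, z2):
    """The example function (Z1 + Z2 - 1) / (Z1 + Z2 - Z1*Z2)."""
    return (z1 + z2 - 1) / (z1 + z2 - z1 * z2)


def max_deviation_from_unit_modulus(N=16):
    """Largest | |f| - 1 | over an N x N grid on the torus T^2."""
    worst = 0.0
    for k1 in range(N):
        z1 = cmath.exp(2j * math.pi * k1 / N)
        for k2 in range(N):
            z2 = cmath.exp(2j * math.pi * k2 / N)
            worst = max(worst, abs(abs(f(z1, z2)) - 1.0))
    return worst


if __name__ == "__main__":
    print(max_deviation_from_unit_modulus())
```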

Although the converses of Theorems 1 and 3 are proved
in the Appendix, there is another result related to the
converse of Theorem 3: by strengthening the condition for
$m = n = 1$ alone, we can get a stronger converse for polynomials.
Before we state this converse, however, we first
give a stability criterion (used in the proof of the converse)
which, although previously known [3], has not appeared
in the engineering literature. Although not as sharp
(in terms of dimension) as some other known criteria [4],
it has two advantages which make it useful for theoretical
purposes: first, it is given in terms of a one-parameter
family of discs without the lower dimensional test in [5];
and second, unlike most other stability tests, which conclude
the nonvanishing of a polynomial on $\bar{U}^2$ from its
nonvanishing on some subset of $\bar{U}^2$ which contains $T^2$,
this test allows the polynomial to vanish at some points in
$T^2$, but concludes only that the polynomial does not
vanish on $U^2$. The criterion is the following.


Theorem  4 

Suppose a polynomial $f$ has no zeros in the set

$$\{(Z_1, Z_2) \in U^2 \mid |Z_1| = |Z_2|\};$$

then $f$ has no zeros in $U^2$.

This is proved in a much more advanced context in [3];
however, it can also be easily proved by applying one of
the criteria in [4] to the polydiscs

$$\bar{U}_r^2 = \{(Z_1, Z_2) \in C^2 \mid |Z_1| \le r,\ |Z_2| \le r\}, \qquad \text{for } 0 < r < 1.$$

The hypotheses imply that $f$ has no zeros on the distinguished
boundary of $\bar{U}_r^2$, for $0 < r < 1$, and none on the set

$$\{(Z_1, Z_2) \in C^2 \mid Z_1 = Z_2\} \cap \bar{U}_r^2.$$

Thus by Theorem 5 in [4], $f$ has no zeros in $\bar{U}_r^2$, for any
$r < 1$, and so $f$ has no zeros in $U^2$.

We can now state and prove the partial converse to
Theorem 3.

Theorem 5

If $f$ is a polynomial with the property that

$$\frac{1}{2\pi} \int_0^{2\pi} \log |f(e^{j\theta}, e^{j(\theta + \psi)})|\, d\theta = \log |f(0, 0)|, \qquad \text{for } 0 \le \psi < 2\pi$$

then $f$ has no zeros in $U^2$.

Thus we strengthen the condition for $m = 1$ and $n = 1$ in
Theorem 3 by specifying that the constant in question is
to be $\log |f(0, 0)|$: it then follows not only that $f$ has
rational spectral factors, but that it is actually zero-free in
$U^2$.

Proof: By Theorem 4, it suffices to prove that $f$ has no
zeros in the set

$$\{(Z_1, Z_2) \in U^2 \mid |Z_1| = |Z_2|\}$$

but this set is the union of the open discs

$$\{(Z_1, Z_2) \mid Z_2 = e^{j\psi} Z_1,\ |Z_1| < 1\}, \qquad \text{for } 0 \le \psi < 2\pi.$$

We therefore wish to prove that $f$ has no zeros in any of
these discs, or equivalently that the function $f_\psi$ of one
variable defined by $f_\psi(Z) = f(Z, Z e^{j\psi})$ has no zeros in the
open unit disc. Applying Jensen's formula [6, p. 299] for
the unit disc to $f_\psi$, we get

$$\frac{1}{2\pi} \int_0^{2\pi} \log |f_\psi(e^{j\theta})|\, d\theta = \log |f_\psi(0)| - \sum_i \log |Z_i|$$

where the summation is over all the zeros $Z_i$ (counted with
multiplicity) of $f_\psi$ in the unit disc. Expressing this in terms
of $f$:

$$\frac{1}{2\pi} \int_0^{2\pi} \log |f(e^{j\theta}, e^{j(\theta + \psi)})|\, d\theta = \log |f(0, 0)| - \sum_i \log |Z_i|$$

and so $\sum_i \log |Z_i| = 0$.

Since for any $Z_i$ in the open unit disc $\log |Z_i| < 0$, the
conclusion follows. (It is clear from the proof that we
always have

$$\frac{1}{2\pi} \int_0^{2\pi} \log |f(e^{j\theta}, e^{j(\theta + \psi)})|\, d\theta \ge \log |f(0, 0)|.$$

It follows from this that in fact the apparently weaker
condition

$$\frac{1}{4\pi^2} \int_0^{2\pi}\!\!\int_0^{2\pi} \log |f(e^{j\theta_1}, e^{j\theta_2})|\, d\theta_1\, d\theta_2 = \log |f(0, 0)|$$

is sufficient to guarantee that $f$ is zero-free in $U^2$. See [2,
p. 73].)
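
The use of Jensen's formula in this proof can be illustrated on a one-variable polynomial. In the sketch below, the polynomial $p(Z) = (Z - 0.5)(Z - 2)$ and the function names are illustrative assumptions, not taken from the paper. It has one zero, $Z = 0.5$, inside the unit disc, so the mean of $\log|p|$ over the unit circle should exceed $\log|p(0)|$ by exactly $-\log 0.5 = \log 2$.

```python
import cmath
import math


def mean_log_modulus(p, N=4096):
    """Approximate (1/2pi) * integral of log|p(e^{j t})| dt by a uniform sum."""
    return sum(
        math.log(abs(p(cmath.exp(2j * math.pi * k / N)))) for k in range(N)
    ) / N


def p(z):
    # Hypothetical example: zeros at 0.5 (inside the unit disc) and 2 (outside).
    return (z - 0.5) * (z - 2.0)


# Jensen's formula: mean of log|p| on the circle = log|p(0)| - sum of log|Z_i|
# over the zeros Z_i inside the unit disc (here only Z_1 = 0.5).
lhs = mean_log_modulus(p)
rhs = math.log(abs(p(0))) - math.log(0.5)

if __name__ == "__main__":
    print(lhs, rhs)
```

Because each term $-\log|Z_i|$ is positive, the circle average can equal $\log|p(0)|$ only when there are no zeros inside the disc, which is the step the proof relies on.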

Stable IIR Filters and Minimum-Phase FIR Filters

The  very  close  relationship  of  spectral  factorization  to 
the  nonvanishing  of  polynomials  in  U2,  and  thereby  to 
stable  IIR  filters  (via  the  denominator  polynomial)  and 
minimum-phase  filters  (via  the  numerator  polynomial)  is 
already  clear  from  the  previous  sections.  The  force  of 
Theorem  2  is  that  purely  from  the  point  of  view  of 
amplitude  response,  transfer  functions  having  rational 
spectral factors are equivalent to those without poles or
zeros in U2. Thus the restrictions on amplitude response
in Theorems 1 and 3 apply to the denominator polynomial
of any stable IIR filter; the contribution of the
denominator  polynomial  to  the  overall  amplitude  re¬ 
sponse  of  the  filter  (in  the  case  of  an  all-pole  filter,  the 
entire  amplitude  response)  must  satisfy  the  restrictions 
imposed  by  Theorems  1  and  3.  We  have,  therefore, 
identified  the  properties  of  the  amplitude  response  which 
make  it  impossible  to  stabilize  a  filter;  if  the  original 
amplitude  response  has  large  overall  variation  in  the 
"wrong"  directions,  attempting  to  find  a  stable  filter 
which  closely  matches  this  response  is  futile.  Close  match¬ 
ing  of  the  amplitude  forces  instability.  This  has  already 
been  shown  by  example  by  Bose  [9]  and  Woods  [10]:  we 
now  see  that  it  is  the  variations  in  the  amplitude  response 
in  the  “wrong”  directions  in  their  examples  which 
account  for  their  behavior. 

It is also of interest to note that, in the Shanks procedure
of minimizing

$$\int_0^{2\pi}\!\!\int_0^{2\pi} |fg - 1|^2\, d\theta_1\, d\theta_2$$

over all polynomials $f$ of given degree (where $g$ is the
original polynomial), if the allowable $f$'s were restricted to
those which have polynomial spectral factorizations, the
procedure would yield a polynomial devoid of zeros in
$U^2$. It does not appear that this observation can be used
as the basis for a workable stabilization method, however,
since the condition that $f$ have polynomial spectral factors
is intractably nonlinear in the coefficients of $f$; and further,
in many cases this procedure would yield an $f$ which
was only marginally stable. For the same reasons, restricting
oneself throughout the design procedure to polynomials
which satisfy the condition in Theorem 5 does not
appear to be a feasible method of ensuring stability.

Examples  and  Comments 

An example of the behavior of those polynomials not
possessing polynomial spectral factors has already appeared
in the literature, although in a different context;
we repeat this example here:

$$A(Z_1, Z_2) = 1 - 0.75 Z_1 + 0.9 Z_1^2 + 1.5 Z_2 - 1.2 Z_1 Z_2 + 1.3 Z_1^2 Z_2 + 1.2 Z_2^2 + 0.9 Z_1 Z_2^2 + 0.5 Z_1^2 Z_2^2.$$

This polynomial was studied in [7]; the associated Shanks
polynomial was found to be stable but to have a substantially
different amplitude response from that of $A$ (for
more details, see [7]). The fact that $A$ does not have
polynomial spectral factors was established by checking
the condition in Theorem 3, for $m = n = 1$ and $\psi = 0, \pi$,
with the following results (correct to nine decimals):

$$\frac{1}{2\pi} \int_0^{2\pi} \log |A(e^{j\theta}, e^{j\theta})|\, d\theta = 0.696570700$$

$$\frac{1}{2\pi} \int_0^{2\pi} \log |A(e^{j\theta}, e^{j(\theta + \pi)})|\, d\theta = 1.134686936.$$
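
These two averages can be reproduced with a few lines of numerical integration. The sketch below is an illustration under stated assumptions rather than the computation used in [7] or by the author: the coefficient of $Z_1^2 Z_2$ is taken to be 1.3, the averages are normalized by $1/2\pi$, and a uniform 4096-point sum stands in for the integral. With these choices the output agrees with the quoted figures.

```python
import cmath
import math

# Coefficients of A, keyed by (power of Z1, power of Z2); the 1.3 entry is the
# reading assumed here for the Z1^2 * Z2 term.
A_COEFFS = {
    (0, 0): 1.0, (1, 0): -0.75, (2, 0): 0.9,
    (0, 1): 1.5, (1, 1): -1.2, (2, 1): 1.3,
    (0, 2): 1.2, (1, 2): 0.9, (2, 2): 0.5,
}


def A(z1, z2):
    """Evaluate the example polynomial at (z1, z2)."""
    return sum(c * z1**i * z2**j for (i, j), c in A_COEFFS.items())


def diagonal_log_mean(f, psi, N=4096):
    """Approximate (1/2pi) * integral of log|f(e^{j t}, e^{j(t + psi)})| dt."""
    total = 0.0
    for k in range(N):
        theta = 2 * math.pi * k / N
        total += math.log(abs(f(cmath.exp(1j * theta), cmath.exp(1j * (theta + psi)))))
    return total / N


if __name__ == "__main__":
    print(diagonal_log_mean(A, 0.0))       # compare with 0.696570700
    print(diagonal_log_mean(A, math.pi))   # compare with 1.134686936
```

The two values differ, so the $m = n = 1$ condition of Theorem 3 fails for $A$.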

As an example of a polynomial with rational spectral
factors, we have

$$B(Z_1, Z_2) = 1 + 2.25 Z_1 + 2.25 Z_2 + 0.5 Z_1^2 + 0.5 Z_2^2 - 6.5 Z_1 Z_2 - Z_1^2 Z_2 - Z_1 Z_2^2 - 4 Z_1^2 Z_2^2.$$

This factors into $(1 + 0.25 Z_1 + 0.25 Z_2 + 0.5 Z_1 Z_2)(1 + 2 Z_1 + 2 Z_2 - 8 Z_1 Z_2)$,
where the first factor has no zeros in $U^2$ and
the second has none in $V^2$; reversing the second factor
gives a polynomial without zeros in $U^2$:

$$\tilde{B}(Z_1, Z_2) = (1 + 0.25 Z_1 + 0.25 Z_2 + 0.5 Z_1 Z_2)(-8 + 2 Z_1 + 2 Z_2 + Z_1 Z_2)$$

$$= -8 - 2 Z_1 Z_2 + 0.5 Z_1^2 + 0.5 Z_2^2 + 1.25 Z_1^2 Z_2 + 1.25 Z_1 Z_2^2 + 0.5 Z_1^2 Z_2^2$$

and $\tilde{B}$ has the same amplitude response as $B$.
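
The reversal used in this example (the same operation as in the proof of Theorem 2) is easy to carry out on coefficient arrays. The sketch below is an illustration only, with hypothetical helper names: it stores each polynomial as a dictionary from exponent pairs to coefficients, reverses the second factor, multiplies the factors out, and compares the two amplitude responses on a grid of the torus.

```python
import cmath
import math


def reverse(poly):
    """Coefficients of Z1^m Z2^n * Pbar(1/Z1, 1/Z2), with m, n the degrees of P."""
    m = max(i for i, _ in poly)
    n = max(j for _, j in poly)
    return {(m - i, n - j): complex(c).conjugate() for (i, j), c in poly.items()}


def multiply(p, q):
    """Product of two polynomials given as {(i, j): coefficient} dictionaries."""
    out = {}
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + a * b
    return out


def evaluate(poly, z1, z2):
    return sum(c * z1**i * z2**j for (i, j), c in poly.items())


def max_amplitude_gap(p, q, N=24):
    """Largest | |p| - |q| | over an N x N grid on the torus T^2."""
    worst = 0.0
    for k1 in range(N):
        z1 = cmath.exp(2j * math.pi * k1 / N)
        for k2 in range(N):
            z2 = cmath.exp(2j * math.pi * k2 / N)
            worst = max(worst, abs(abs(evaluate(p, z1, z2)) - abs(evaluate(q, z1, z2))))
    return worst


FIRST = {(0, 0): 1.0, (1, 0): 0.25, (0, 1): 0.25, (1, 1): 0.5}
SECOND = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 2.0, (1, 1): -8.0}
B = multiply(FIRST, SECOND)
B_TILDE = multiply(FIRST, reverse(SECOND))

if __name__ == "__main__":
    print(sorted(B_TILDE.items()))
    print(max_amplitude_gap(B, B_TILDE))
```

The grid comparison is a spot check, not a proof; the equality of the amplitudes on $T^2$ follows from the identity $|\tilde{P}| = |P|$ established in the proof of Theorem 2.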

In order to gain some idea of the stringency of the
conditions in Theorem 3, let us consider the case of an
ideal bandpass filter. By an ideal bandpass filter we mean
a filter whose amplitude response is equal to 1 on some
subset, $A$, of the square $0 \le \theta_1 < 2\pi$, $0 \le \theta_2 < 2\pi$, and equal
to $\epsilon$ (a small positive constant) on the complement of $A$
(of course this specification continues over the whole plane
by periodicity). This of course is not the amplitude response
of any rational function, but in practice for certain shapes
of the set $A$, one may wish to approximate such a response by a
rational function. One easily sees that up to a scale factor,
the averages in Theorem 3 are in this case merely the
fraction

$$\frac{\text{length of the line } L_\psi \text{ lying in the complement of } A}{\text{total length of the line } L_\psi}.$$

It is easily seen from this that there are very few passband
shapes of practical interest which satisfy even the first of
these conditions (where $n = 1$ and $m = 1$); in other words,
there are very few which can be accurately approximated
by transfer functions having rational spectral factors.
(This is not to imply that one would in practice be
restricted to such filters: the above discussion is meant
solely as an indication of the severity of the restrictions on
the amplitude of such filters.)


MURRAY: SPECTRAL FACTORIZATION


Finally,  we  remark  that  there  does  not  seem  to  be  any 
difficulty  in  extending  the  results  in  this  paper  to  higher 
dimensions,  and  to  multidimensional  systems  other  than 
digital  filters. 


Appendix 

The  Converses  to  Theorems  1  and  3 

These converses involve some technical ideas and results
from [2]; the most important ideas are those of inner
function [2, p. 105], outer function [2, p. 72], Poisson
integral [2, p. 17], and the classes $N(U^2)$ [2, p. 44] and
$N_*(U^2)$ [2, p. 44].

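We will also use the following notation from [2] (here $f$
is an analytic function on $U^2$):

i) $$f^*(e^{j\theta_1}, e^{j\theta_2}) = \lim_{r \to 1^-} f(r e^{j\theta_1}, r e^{j\theta_2})$$

will denote the radial limit of $f$ (this is clearly consistent
with our previous use of $f^*$).

ii) For $w = (w_1, w_2) \in T^2$, $f_w(Z)$ will denote the one-variable
function defined by

$$f_w(Z) = f(Z w_1, Z w_2).$$

iii) If $\phi$ is a function defined on $T^2$ which is absolutely
integrable there:

$$\hat{\phi}(m, n) = \frac{1}{4\pi^2} \int_0^{2\pi}\!\!\int_0^{2\pi} \exp(-jm\theta_1 - jn\theta_2)\, \phi(\theta_1, \theta_2)\, d\theta_2\, d\theta_1$$

will denote the Fourier coefficients of $\phi$.

iv) For any function $\phi$ on $T^2$

$$\frac{1}{4\pi^2} \int_0^{2\pi}\!\!\int_0^{2\pi} \phi(\theta_1, \theta_2)\, d\theta_1\, d\theta_2$$

will be denoted by

$$\int_{T^2} \phi\, dm \quad \text{or} \quad \int_{T^2} \phi(w)\, dm(w).$$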

We will first prove the converse to Theorem 1, and
from this derive the converse to Theorem 3. First of all,
however, we need the following lemma (which is given as
a problem in [2]).

Lemma A1

If $\phi$ is a real-valued function defined on $T^2$ such that

$$\phi \in L^1(T^2)$$

i.e.,

$$\int_{T^2} |\phi|\, dm < \infty$$

and

$$\hat{\phi}(m, n) = 0, \qquad \text{for } mn < 0$$

then there is an outer function $f$ on $U^2$ such that

$$P[\phi] = \log |f|$$

(where $P[\ ]$ denotes "Poisson integral of").


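Proof: Let

$$a_{mn} = \begin{cases} \hat{\phi}(m, n), & (m, n) \ne (0, 0) \\ \tfrac{1}{2}\hat{\phi}(m, n), & (m, n) = (0, 0) \end{cases}$$

and let

$$g(Z_1, Z_2) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} a_{mn} Z_1^m Z_2^n.$$

This series clearly converges uniformly on compact
subsets of $U^2$, and so defines an analytic function there.
If we let $f = e^{2g}$, then $f$ is analytic in $U^2$, and

$$\log |f(r_1 e^{j\theta_1}, r_2 e^{j\theta_2})| = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} a_{mn} r_1^m r_2^n \exp(jm\theta_1 + jn\theta_2) + \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} \bar{a}_{mn} r_1^m r_2^n \exp(-jm\theta_1 - jn\theta_2)$$

$$= \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \hat{\phi}(m, n)\, r_1^{|m|} r_2^{|n|} \exp(jm\theta_1 + jn\theta_2)$$

$$= P[\phi] \qquad [2, \text{p. } 17].$$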

Next we prove that $f$ is outer: we have (for $0 < r < 1$)

$$\int_{T^2} \log^+ |f(rw)|\, dm(w) \le \int_{T^2} \bigl|\log |f(rw)|\bigr|\, dm(w)$$

$$= \int_{T^2} |P[\phi](rw)|\, dm(w)$$

$$\le \int_{T^2} |\phi(w)|\, dm(w) \qquad [2, \text{Theorem 2.1.3(c)}]$$

$$< \infty$$

and so $f \in N(U^2)$.

Now $f^*$ exists almost everywhere on $T^2$ [2, Theorem
3.3.5] and $\log |f^*| = \phi$ almost everywhere on $T^2$ [2, Theorem
2.2.1]; thus $\log |f| = P[\log |f^*|]$ and so $f \in N_*(U^2)$ [2,
Theorem 3.3.5], and $\log |f(0)| = \int_{T^2} \log |f^*(w)|\, dm(w)$. Thus
$f$ is outer. Q.E.D.

We can now prove the converse to Theorem 1.

Theorem A2

Let $f(Z_1, Z_2)$ be a rational function ($\not\equiv 0$), and let
$\phi = \log |f^*|$.

If $\hat{\phi}(m, n) = 0$, for $mn < 0$, then there is a rational function
$g$ without poles or zeros in $U^2$ such that $|g^*| = |f^*|$.

Proof: By Lemma A1, there is an outer function $g$ such
that

$$\log |g| = P[\log |f^*|].$$

This implies

$$\log |g^*| = \log |f^*|$$

almost everywhere on $T^2$. Therefore, for almost all $w \in T^2$

$$\log |g_w^*(Z)| = \log |f_w^*(Z)|$$

for almost all $Z \in T$ [2, Lemma 3.3.2]; and $g_w$ is outer for
almost all $w \in T^2$ [2, Lemma 4.4.4].

For any such $w$, let $Z_1, \ldots, Z_n$ denote the poles, and
$Z_{n+1}, \ldots, Z_m$ denote the zeros of $f_w(Z)$ in $U$, and let

$$\tilde{f}_w(Z) = \prod_{k=1}^{n} \frac{Z - Z_k}{\bar{Z}_k Z - 1} \prod_{k=n+1}^{m} \frac{\bar{Z}_k Z - 1}{Z - Z_k}\, f_w(Z).$$

Then $\tilde{f}_w$ has no poles or zeros in $U$ and is rational; hence,
$\tilde{f}_w$ is outer. Since $g_w$ is outer, we have $\tilde{f}_w/g_w$ is outer. Also
$|\tilde{f}_w^*| = |f_w^*|$, and so $|\tilde{f}_w^*| = |g_w^*|$, for almost all $Z \in T$. Thus
$\tilde{f}_w/g_w$ is inner. But a function which is both outer and
inner is a constant of modulus 1, and so

$$g_w = e^{j\psi} \tilde{f}_w, \qquad \text{for some real } \psi.$$

Thus $g_w$ is rational for almost all $w \in T^2$, and so $g_w$ is
rational for all $w \in E$, where $E \subseteq T^2$ is a compact set of
positive measure (by the inner regularity of the measure).
It follows by [2, Theorem 5.2.2] that $g$ is rational (since the
vanishing of a polynomial $P$ on a set of positive measure
in $T^2$ would imply

$$\log |P^*| \notin L^1(T^2)$$

and so $P \equiv 0$.)

Thus $g$ is a rational function without poles or zeros in
$U^2$, and

$$|g^*| = |f^*|, \qquad \text{almost everywhere in } T^2$$

and so, since $g$ and $f$ are both rational

$$|g^*| = |f^*|, \qquad \text{on } T^2. \qquad \text{Q.E.D.}$$

We next prove the converse to Theorem 3.

Theorem A3

Let $f(Z_1, Z_2)$ be a rational function ($\not\equiv 0$) and let

$$\phi = \log |f^*|.$$

If $\frac{1}{2\pi} \int_0^{2\pi} \phi(m\theta, n\theta + \psi)\, d\theta$ is a constant independent of $\psi$
for each pair $(m, n)$, with $m > 0$ and $n > 0$, then there is a
rational function $g$ without poles or zeros in $U^2$ such that
$|g^*| = |f^*|$.

Proof: Let $m > 0$, $n > 0$, and let $l \ne 0$ be an integer. Then

$$\int_0^{2\pi} e^{jlm\psi} \int_0^{2\pi} \phi(m\theta, n\theta + \psi)\, d\theta\, d\psi = 0$$

i.e.,

$$\int_0^{2\pi}\!\!\int_0^{2\pi} e^{jlm\psi}\, \phi(m\theta, n\theta + \psi)\, d\theta\, d\psi = 0.$$

Making the change of variables $\theta_1 = m\theta$, $\theta_2 = n\theta + \psi$,
we get

$$\frac{1}{m} \int_0^{2\pi m}\!\!\int_{(n/m)\theta_1}^{(n/m)\theta_1 + 2\pi} \exp(jlm\theta_2 - jln\theta_1)\, \phi(\theta_1, \theta_2)\, d\theta_2\, d\theta_1 = 0$$

and since the integrand is periodic in $\theta_1$ and $\theta_2$

$$\int_0^{2\pi}\!\!\int_0^{2\pi} \exp(jlm\theta_2 - jln\theta_1)\, \phi(\theta_1, \theta_2)\, d\theta_2\, d\theta_1 = 0$$

and so

$$\hat{\phi}(-ln, lm) = 0, \qquad \text{for all } l \ne 0,\ m > 0, \text{ and } n > 0$$

that is

$$\hat{\phi}(m, n) = 0, \qquad \text{for all } m, n, \text{ with } mn < 0.$$

The result now follows from Theorem A2. Q.E.D.

Finally, we note that if $f$ in Theorem A3 is a polynomial,
then the converse in Theorem 2 implies that $f$ has
polynomial spectral factors. Thus we have the full converse
of Theorem 3 for polynomials.


References

[1] M. P. Ekstrom and J. W. Woods, "Two-dimensional spectral
factorization with application to recursive digital filtering," IEEE
Trans. Acoust., Speech, Signal Processing, vol. ASSP-24, pp.
115-128, Apr. 1976.

[2] W. Rudin, Function Theory in Polydiscs. New York: Benjamin,
1969.

[3] W. Stoll, Holomorphic Functions of Finite Order in Several Complex
Variables (CBMS Regional Conference Series in Mathematics).
Providence, RI: AMS, 1973.

[4] R. A. DeCarlo, J. Murray, and R. Saeks, "Multivariable Nyquist
theory," to be published.

[5] T. S. Huang, "Stability of two-dimensional recursive filters," IEEE
Trans. Audio Electroacoust., vol. AU-20, pp. 158-163, June 1972.

[6] W. Rudin, Real and Complex Analysis. New York: McGraw-Hill,
1966.

[7] J. L. Shanks, S. Treitel, and J. H. Justice, "Stability and synthesis
of two-dimensional recursive filters," IEEE Trans. Audio Electroacoust.,
vol. AU-20, pp. 115-128, June 1972.

[8] R. C. Gunning and H. Rossi, Analytic Functions of Several Complex
Variables. Englewood Cliffs, NJ: Prentice-Hall, 1965.

[9] N. K. Bose, "Problems in stabilization of multidimensional filters
via Hilbert transform," IEEE Trans. Geosci. Electron., vol. GE-12,
pp. 146-147, Oct. 1974.

[10] J. W. Woods, IEEE Trans. Geosci. Electron. (Corresp.), vol. GE-12,
p. 104, July 1974.

Abstract

It is shown that time-varying systems may be modelled in terms
of semidirect product algebras, and that the known theory of induced
representations for these algebras in many cases enables
one to give sharp criteria for stability of such systems.
Finally, an example is given in which a previously known result
is proved using these techniques.


1.  INTRODUCTION 

In this paper the theory of semidirect
product algebras is proposed as an approach
to the problem of the stability of
time-varying systems. We first describe
these algebras, and then show how a large
class of time-varying systems may be modelled
in terms of them. In the third
paragraph, an approach to the problem of
stability in general Banach-algebraic
terms is described, and in the fourth, the
special structure and known properties of
semidirect product algebras are shown to
be particularly suited to this approach.
Finally, the theory is applied to a particular
situation to derive some results
very similar to those proved by Davis [1,2]
by other methods.

2.  SEMIDIRECT  PRODUCT  ALGEBRAS 
Given a locally compact Abelian group $G$
and a separable C*-algebra $A$ with identity
on which the group $G$ acts as a group
of isometric *-automorphisms, we define the
set $L^1(G, A)$ to be the Banach space of
Bochner-integrable functions

$$f: G \to A.$$

(To be more accurate, we assume that

$$T: G \to \mathrm{Aut}(A)$$

is a continuous homomorphism of $G$ into the
set of isometric *-automorphisms of $A$
with the strong topology; we normally
suppress $T$ and consider the elements of $G$
to be automorphisms of $A$.)

The product on $L^1(G, A)$ is defined by

$$(fh)(x) = \int_G f(y)\,[y(h(x - y))]\, d\mu(y), \qquad \forall x \in G,\ f, h \in L^1(G, A) \qquad (1)$$

(The Abelian group $G$ is written additively
and $d\mu$ denotes the Haar measure on $G$.)

The involution on $L^1(G, A)$ is defined by

$$f^*(x) = x(f(-x)^*).$$

* This research supported in part by the Joint Services Electronics Program at Texas
Tech University under ONR Contract 76-C-1136.


With these definitions, $L^1(G, A)$ becomes a
Banach *-algebra, called the twisted group
algebra on $G$ with values in $A$. The enveloping
C*-algebra of $L^1(G, A)$ [3] will be
denoted by $C^*(G, A)$. The above is a simplified
version of a more general construction
given in [4].

Finally,  we  note  that  an  alternative  ap¬ 
proach  to  senidlrecf  product  algebras  is 
to  define  them  as  algebras  of  sections  of 
Banach*-algcbralc  bundles[5 ,6] .  We  will 
not  use  this  concept  here,  but  we  note 
(in  connection  with  Stability  and  Prim¬ 
itive  Ideas)  that  Banach*-algebraic  bund¬ 
les  were  introduced  as  a  powerful  tool  for 
calculating  certain  representations  -  the 
induced  representations  -  of  Banach*- 
alcebras . 

3. ALGEBRAS OF TIME-VARYING SYSTEMS AS
SEMIDIRECT PRODUCTS.

In order to simplify the exposition and to
ensure that our algebras $L^1(G,A)$ have an
identity, we will limit ourselves from now
on to discrete-time systems, that is, we
will assume $G = Z$. It should be clear,
however, that the corresponding theory
will hold good for continuous-time systems.

We take the algebra A to be any complete
algebra of bounded functions on Z which
contains the identity. Multiplication is
defined pointwise, and the norm is taken
to be the sup-norm. Since every such algebra
is a commutative C*-algebra with
identity, it is isomorphic to C(X) for
some compact Hausdorff space X.

The group Z acts on A in the obvious way:

$$(g(a))(n) = a(n-g) \quad \text{for } a \in A,\ n, g \in Z.$$


Now the operator describing the input-output
mapping of a scalar-input, scalar-output
system with coefficients in A may
be written formally as

$$\sum_{g \in Z} a_g(\cdot)\,g; \qquad a_g \in A,\ g \in Z. \tag{2}$$

This transforms input sequences x(n) to
output sequences y(n) by

$$y(m) = \sum_{g \in Z} a_g(m)\,x(m-g),$$

which has the obvious physical interpretation
as a sum of delayed inputs with time-varying
weights. It is clear that the
formal expressions in (2) are just functions
from Z to A, and so can be regarded
as elements of $L^1(Z,A)$; the only non-obvious
formal property is that the cascade
connection of two such systems is
represented by the product (1).

This follows formally from:

$$\Big(\sum_{h \in Z} b_h(\cdot)\,h\Big)\Big(\sum_{g \in Z} a_g(\cdot)\,g\Big)\,x(n)$$

$$= \Big(\sum_{h \in Z} b_h(\cdot)\,h\Big)\Big(\sum_{g \in Z} a_g(n)\,x(n-g)\Big)$$

$$= \sum_{h \in Z}\sum_{g \in Z} b_h(n)\,a_g(n-h)\,x(n-g-h)$$

$$= \sum_{h \in Z}\sum_{k \in Z} b_h(n)\,[h(a_{k-h})](n)\,x(n-k)$$

$$= \sum_{k \in Z}\Big\{\sum_{h \in Z}\big[b_h\,h(a_{k-h})\big](n)\Big\}\,x(n-k).$$

The analytic details of the above formal
manipulations are easily checked and will
not be considered here.
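The formal manipulation above can be checked numerically for systems with finitely many delays. The following is a minimal sketch, not part of the original paper: the helper names `apply_system` and `twisted_product`, and the example weights, are hypothetical and chosen only for illustration. It compares the cascade of two time-varying systems with the single system whose coefficients are given by the product (1).

```python
def apply_system(coeffs, x):
    """Return y with y(n) = sum_g a_g(n) * x(n - g).

    coeffs maps a delay g (int) to a weight function a_g(n);
    x is an input sequence given as a function of the integer n.
    """
    return lambda n: sum(a(n) * x(n - g) for g, a in coeffs.items())


def twisted_product(b, a):
    """Coefficients of the cascade 'a first, then b' via product (1):
    (ba)_k(n) = sum_h b_h(n) * a_{k-h}(n - h)."""
    terms = {}
    for h, b_h in b.items():
        for g, a_g in a.items():
            terms.setdefault(h + g, []).append((b_h, a_g, h))
    return {
        k: (lambda n, ts=ts: sum(b_h(n) * a_g(n - h) for b_h, a_g, h in ts))
        for k, ts in terms.items()
    }


# Hypothetical example systems with time-varying weights.
a = {0: lambda n: 1.0 + 0.5 * (n % 2), 1: lambda n: 0.25 * n}
b = {0: lambda n: 2.0, 2: lambda n: (-1.0) ** n}
x = lambda n: float((3 * n * n + n) % 7)

cascade = apply_system(b, apply_system(a, x))        # apply a, then b
product = apply_system(twisted_product(b, a), x)     # single system from (1)
```

Evaluating `cascade(n)` and `product(n)` at any integer n should give the same value, which is the content of the last line of the derivation.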

4. STABILITY AND PRIMITIVE IDEALS

The utility of transform methods in determining
the stability of time-invariant
linear systems stems from the fact that in
a commutative Banach algebra, an element
is invertible if and only if its Gelfand
transform is. It is therefore natural to
seek an analogue of this in the non-commutative
case, and the most obvious
analogue is the representation of a
Banach algebra as a subdirect product
over the set of primitive ideals. (See
[7], where also the terminology used in
this section is defined.) Here we have
the property that an element is invertible
if and only if its image in the
quotient algebra by every primitive
ideal is invertible. Thus, in order to
determine invertibility (and hence stability)
it is necessary to determine all
primitive ideals in our algebra. Since
a primitive ideal is by definition the
kernel of an irreducible representation,
this problem is related to the problem of
determining the irreducible representations
of an algebra. It is precisely
these problems which have received the
most attention in the literature on semidirect
products. In particular, the
primitive ideals are studied in [8,9,10,11],
especially with reference to a conjecture
made in [8] that all primitive
ideals of a semidirect product algebra
can be found as kernels of representations
induced from the isotropy subgroups
of the group action of G on X, where the
original C*-algebra A is given as C(X).

In this connection, it is interesting to
note, as mentioned before, that semidirect
product algebras can be realized
as algebras of sections of Banach
*-algebraic bundles, which were introduced
precisely for the purpose of extending
the idea of induced representation to
Banach algebras. There is thus some hope
that the problem of determining all primitive
ideals can be solved by known techniques.
Rather than give a theoretical
discussion of these concepts, however,
which would take us too far afield and
occupy too much space, we will simply
give an example of their application to
derive a previously known result in the
next section.

5.  EXAMPLE 

Let $A = \{f: Z \to \mathbb{C} \mid f(2n) = f(0) \text{ and } f(2n+1) = f(1),\ \forall n\}$,
which gives us the class of
periodically time-varying systems whose
period is twice the sampling period.

$A \cong C(X)$, where $X = \{0, 1\}$.

Z acts on X by

g = identity, g even;
g(0) = 1 and g(1) = 0, g odd.

Thus X is the unique orbit, and $H = \{2n \mid n \in Z\}$
is the unique isotropy subgroup.

Now, using the results of [10], we see that
every primitive ideal is obtained by inducing
from the irreducible representations
of H. Since $H \cong Z$, the latter are just
the usual frequencies, whose representing
measures are given by

$$\big(e^{jn\omega}\big)_{n=-\infty}^{\infty} \quad \text{for } -\pi \le \omega < \pi.$$

We will calculate the induced representations
following [8]. An induced measure
on $Z \times X$ is given by

$$\mu: f \mapsto \sum_{n=-\infty}^{\infty} f(2n,0)\,e^{jn\omega}.$$

(Here we are making the natural identification
of

$f: Z \to C(X)$ with $f: Z \times X \to \mathbb{C}$.)


$$(f^* g)(n,0) = \sum_{m=-\infty}^{\infty} \bar{f}(-2m,0)\,g(n-2m,0) + \sum_{m=-\infty}^{\infty} \bar{f}(-2m-1,1)\,g(n-2m-1,1) \qquad (n \text{ even})$$


$$(f^* g)(n,0) = \sum_{m=-\infty}^{\infty} \bar{f}(-2m,0)\,g(n-2m,1) + \sum_{m=-\infty}^{\infty} \bar{f}(-2m+1,1)\,g(n-2m-1,0) \qquad (n \text{ odd})$$

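from which it follows that

$$\mu(f^* f) = \Big|\sum_m f(2m,0)\,e^{jm\omega}\Big|^2 + \Big|\sum_m f(2m-1,1)\,e^{jm\omega}\Big|^2$$

and so we have a two-dimensional representation

$$g \mapsto \Big(\sum_m g(2m,0)\,e^{jm\omega},\ \sum_m g(2m-1,1)\,e^{jm\omega}\Big)$$

on which the action of f can be represented
by left multiplication by the matrix

$$\begin{bmatrix} \sum_m f(2m,0)\,e^{jm\omega} & \sum_m f(2m+1,0)\,e^{jm\omega} \\ \sum_m f(2m-1,1)\,e^{jm\omega} & \sum_m f(2m,1)\,e^{jm\omega} \end{bmatrix}$$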

If the system is finite, this will be a
rational matrix in $z = e^{j\omega}$, and the original
operator will be invertible if this matrix
is nonsingular on the unit circle. A homotopy-type
argument then easily proves
that if the original operator was causal
and bounded, it will have a bounded, causal
inverse if the determinant of the matrix
has zero winding number about the
origin.
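For a finite period-2 system, the matrix above and the winding-number test can be evaluated directly. The sketch below is illustrative only and is not taken from the paper: a system is stored as a dictionary mapping (n, ξ) to the coefficient f(n, ξ), with ξ in X = {0, 1}; the function names are hypothetical; and the product is written under the assumption that the action of Z on X is the parity shift described in this section.

```python
import numpy as np


def twisted_product(f, g):
    """(fg)(n, xi) = sum_y f(y, xi) * g(n - y, (xi - y) mod 2) on Z x X."""
    out = {}
    for (y, xi), fv in f.items():
        for (m, xg), gv in g.items():
            if xg == (xi - y) % 2:
                out[(y + m, xi)] = out.get((y + m, xi), 0.0) + fv * gv
    return out


def symbol(f, w):
    """2x2 matrix of the induced representation at frequency w."""
    M = np.zeros((2, 2), dtype=complex)
    for (n, xi), val in f.items():
        if xi == 0 and n % 2 == 0:
            M[0, 0] += val * np.exp(1j * (n // 2) * w)        # f(2m, 0)
        elif xi == 0:
            M[0, 1] += val * np.exp(1j * ((n - 1) // 2) * w)  # f(2m+1, 0)
        elif n % 2 == 1:
            M[1, 0] += val * np.exp(1j * ((n + 1) // 2) * w)  # f(2m-1, 1)
        else:
            M[1, 1] += val * np.exp(1j * (n // 2) * w)        # f(2m, 1)
    return M


def winding_number(f, samples=4096):
    """Winding number of det(symbol) about the origin for w in [-pi, pi]."""
    ws = np.linspace(-np.pi, np.pi, samples + 1)
    dets = np.array([np.linalg.det(symbol(f, w)) for w in ws])
    phase = np.unwrap(np.angle(dets))
    return int(round((phase[-1] - phase[0]) / (2 * np.pi)))
```

Under these assumptions the matrix is multiplicative, that is, `symbol(twisted_product(f, g), w)` agrees with `symbol(f, w) @ symbol(g, w)`, which is what makes nonsingularity on the unit circle a test for invertibility. The winding number is only meaningful when the determinant does not vanish on the sampled circle.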

In this connection, we note that the
theory of semidirect products has been
developed entirely for *-algebras, so that
if we wish to apply it to causal systems,
we must either redevelop the entire theory
for algebras without involution, or develop
the topological (and other) criteria
which give conditions for a causal, invertible
operator to have a causal inverse.
The latter course seems preferable.

Finally, we note that the above example is
a special case of the results of Davis [1].

Optimal estimation in signal-dependent noise

Optimal estimators are derived for a class of signal-dependent noise processes. Such processes are
of interest in optics because phenomena, such as film grain noise, are often modeled in this manner.
This paper demonstrates that when one ignores the presence of signal-dependent noise and instead
assumes only signal-independent noise models, the resulting estimators may pay a severe penalty
in performance. This "mismatch" problem is explored, with the results of Monte Carlo simulations
of the performances of both optimum and mismatched estimators being presented. The Cramer-Rao
lower bounds on the mean-square estimation errors for unbiased estimators are evaluated and compared
with the lower bounds derived for the signal-independent noise case. Overall, the results
indicate that improved performance will, in most cases, offset the increased complexity inherent in
estimators designed for the signal-dependent noise model.


INTRODUCTION 

In contrast to the signal-independent additive noise models
traditionally encountered in statistical communication
theory,1,2 many physical noise processes are inherently
signal-dependent. Common examples from optical processing
include film-grain noise, encountered in image processing, and
photoelectronic shot noise, which is sometimes dominant
when imaging at low-light levels with photoemissive detectors.3,4
An example of a nonoptical noise source which is effectively
signal dependent is magnetic tape recording noise.3
A study of these particular examples indicates that studies of
optimum estimation in signal-dependent noise processes
would have applications to a broad class of signal-processing
problems in modern optics and in other fields.

To date, the majority of the work dealing with signal-dependent
noise has been concentrated on rather specialized
examples and applications. Using a Poisson point-process
noise model, Goodman and Belsher* have considered the
restoration of atmospherically degraded images using linear
minimum mean-square error filters. Walkup and Choens4
modified the familiar Wiener filter for various additive,
Gaussian signal-dependent noise models, and Naderi7 has
done considerable additional work on this problem. Additionally,
Hunt* has derived a nonlinear maximum a posteriori
(MAP) estimator, based on a different model than the one
considered here, which can accommodate both signal-dependent
and signal-independent noise cases, and has
applied this MAP estimator to restoring noise-degraded images.
For such applications, and in the special case where the
images of interest exhibit extremely low contrasts, conventional
restoration techniques perform rather poorly. Thus,
heuristic algorithms, such as the so-called "noise cheating"
algorithm for film-grain noise suppression,9 have been developed.
Other algorithms, which explicitly include the signal
dependence of the noise, as well as incorporating pertinent
properties of the human visual system, have also been investigated.7,10,11

The purpose of this paper, then, is twofold. First, several
fundamental properties of signal-dependent noise are investigated
in order to better understand when consideration of
signal dependence is warranted and when it can be ignored.
To this end, the mean-square estimation error is first considered
for both the signal-dependent and signal-independent
cases. In addition, the mean-square estimation error for a
mismatched case is evaluated. The mismatch case considered
is one in which the signal-dependent measurement model is
valid but is ignored for purposes of simplification. Secondly,
optimal estimators are derived for several cases of both
signal-dependent and signal-independent models. The Cramer-Rao
lower bound on mean-square estimation error is also
determined, in order to find the lowest error possible for both
signal-dependent and signal-independent estimators. The
results of Monte Carlo simulations of the performance of the
various optimal estimators previously derived are presented
for several values of the model parameters and for various
prior signal probability densities.

PROBLEM  STATEMENT 

To motivate the investigation of signal-dependent noise
processes, it is necessary first to define the models to be used.
The signal-dependent measurement model to be used is given
by

$$r = s + k f(s)\,n_1 + n_2, \tag{1}$$

where $n_1$ and $n_2$ are signal-independent random noise processes;
s is the underlying signal to be estimated, which is assumed
to have probability density p(s); $n_1$, $n_2$, and s are assumed
to be mutually statistically independent; f(s) is any
function of the signal; k is a scalar constant; and r is the noisy
measurement. The signal-dependent noise term in Eq. (1)
is, of course, the term $k f(s)\,n_1$. It is often physically reasonable
to assume that both $n_1$ and $n_2$ are zero mean and have
unimodal probability densities. Further, note that substitution
of k = 0 in Eq. (1) yields

$$r = s + n_2, \tag{2}$$

which is just the familiar textbook additive, signal-independent
noise model.1,2 In both Eq. (1) and Eq. (2), the arguments
of all of the variables have been dropped for simplification.
It should be remembered that these arguments may
depend on time, position, or both.
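As an illustration of the two measurement models, the following sketch draws samples from Eq. (1) and, with k = 0, from Eq. (2). It assumes Gaussian $n_1$ and $n_2$, which the problem statement does not require at this point; the function name and parameter names are hypothetical.

```python
import numpy as np


def simulate_measurements(s, k, f, sigma1, sigma2, rng):
    """Draw r = s + k*f(s)*n1 + n2 with independent zero-mean Gaussian n1, n2.

    s      : array of signal values
    k      : scalar constant of Eq. (1); k = 0 gives the model of Eq. (2)
    f      : function of the signal, applied elementwise
    sigma1 : standard deviation of n1 (assumed Gaussian here)
    sigma2 : standard deviation of n2 (assumed Gaussian here)
    rng    : numpy random Generator
    """
    s = np.asarray(s, dtype=float)
    n1 = rng.normal(0.0, sigma1, size=s.shape)
    n2 = rng.normal(0.0, sigma2, size=s.shape)
    return s + k * f(s) * n1 + n2
```

For a fixed signal value s, the sample variance of r should approach $k^2\sigma_1^2[f(s)]^2 + \sigma_2^2$, so the noise power grows with the signal whenever k is nonzero and f is increasing.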

It will be shown repeatedly that the model of Eq. (2) yields
far simpler estimators than does Eq. (1), as would be expected.
The following example serves to illustrate why it may prove
worthwhile to employ the more complex estimators resulting
from Eq. (1).

When the observations are actually of the type given by Eq.
(2), it can be shown2 that simply using the received value as
the estimate results in a minimum-variance unbiased estimate,
i.e.,

$$\hat{s} = r, \tag{3}$$

where the circumflex denotes the estimate. The average error
is then given by

$$E\{\hat{s} - s\} = E\{r - s\} = E\{n_2\} = 0. \tag{4}$$

The estimator is said to be unbiased since the mean error is
zero. A measure of the performance of this estimator, conditioned
on the signal value, is given by the conditional
mean-square error, and is found to be

$$E\{(\hat{s} - s)^2 \mid s\} = E\{n_2^2\} = \sigma_2^2, \tag{5}$$

which is simply the variance of the additive noise process $n_2$.
This estimator is obviously simple from an implementation
point of view.

With this in mind, consider a case in which the observations
are actually of the type given by the signal-dependent model
of Eq. (1). For ease of implementation it is decided to use the
estimate given in Eq. (3), which was designed for the
signal-independent noise process. This represents a mismatched
situation, where an estimator based upon an incorrect
measurement model (corresponding to ignoring the signal
dependency) is used. Once again, the average estimation error
is zero, due to $n_1$ and $n_2$ being assumed zero mean and to the
assumed mutual statistical independence of $n_1$, $n_2$, and s.

However, assuming $\hat{s} = r$, the mean-square estimation error
for this mismatched case is given by

$$E\{(\hat{s} - s)^2 \mid s\} = k^2\sigma_1^2\,E\{[f(s)]^2\} + \sigma_2^2. \tag{6}$$

For convex f(s), $f(s) \ge 0$ for all s, Jensen's inequality13
states that $E\{f(s)\} \ge f[E\{s\}]$, where $E\{\cdot\}$ denotes the expected
value. This inequality may be used to find a lower bound for
the mean-square estimation error for the mismatched case.
Thus, recalling Eqs. (5) and (6),

$$\sigma_2^2 \le k^2\sigma_1^2\,\{f[E(s)]\}^2 + \sigma_2^2 \le k^2\sigma_1^2\,E\{[f(s)]^2\} + \sigma_2^2. \tag{7}$$

Note that this gives a lower bound (the middle term) on the
mean-square estimation error of the mismatched estimator,
and that this bound contains a function of the signal's mean.
The left-most term of Eq. (7) is the mean-square estimation
error given by Eq. (5). Note that the mismatched mean-square
estimation error is in general greater than the error for
the same estimator when used in the presence of signal-independent
noise. We next consider an illustration of the
significance of Eq. (7).
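The chain of inequalities in Eq. (7) can be checked with a small Monte Carlo experiment. The sketch below is not from the paper: it assumes Gaussian noise, replaces the expectations over s by sample means, and uses hypothetical function names. It uses the mismatched estimate $\hat{s} = r$ on data generated from Eq. (1).

```python
import numpy as np


def mismatched_mse(s, k, f, sigma1, sigma2, rng):
    """Empirical mean-square error of the estimate s_hat = r under model (1)."""
    n1 = rng.normal(0.0, sigma1, size=s.shape)
    n2 = rng.normal(0.0, sigma2, size=s.shape)
    r = s + k * f(s) * n1 + n2
    return np.mean((r - s) ** 2)


def mse_eq6(s, k, f, sigma1, sigma2):
    """Right-hand side of Eq. (6), with E{[f(s)]^2} taken as a sample mean."""
    return (k * sigma1) ** 2 * np.mean(f(s) ** 2) + sigma2 ** 2


def jensen_bound(s, k, f, sigma1, sigma2):
    """Middle term of Eq. (7): f evaluated at the sample mean of s."""
    return (k * sigma1) ** 2 * f(np.mean(s)) ** 2 + sigma2 ** 2
```

With a convex, nonnegative f such as f(s) = s on positive signals, the empirical error should sit close to `mse_eq6` and above both `jensen_bound` and $\sigma_2^2$.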

A commonly used model in image processing when the observed
quantity is the photographic density is given by Eq. (1)
with f(s) given by $s^p$.4,10 From Eq. (6), then, the mismatched
mean-square estimation error becomes

$$E\{(\hat{s} - s)^2 \mid s\} = k^2\sigma_1^2\,E\{s^{2p}\} + \sigma_2^2, \tag{8}$$

where k is a scanning constant relating the scanning aperture
area to the mean area of a film grain. A typical value of p used
for characterizing photographic film-grain noise is p = 1/2,
though p = 1/3 has also been used.4,10 Thus, Eq. (8) becomes
(for p = 1/2)

$$E\{(\hat{s} - s)^2 \mid s\} = k^2\sigma_1^2\,E\{s\} + \sigma_2^2, \tag{9}$$

which is greater than the variance of Eq. (5) by the addition
of a term which is proportional to the signal mean. Note that
in the particular case of p = 1/2, the equality holds between
the last two terms in Eq. (7), but that for general p this is not
the case. Here, the lower bound on the mean-square estimation
error given by Eq. (7) becomes

$$E\{(\hat{s} - s)^2 \mid s\} \ge k^2\sigma_1^2\,[E(s)]^{2p} + \sigma_2^2. \tag{10}$$

The lower bound given by Eq. (10) may be visualized with
the aid of Figs. 1 and 2, for various values of k, p, and E(s). In
all cases, the plane upon which the surfaces rest is not the zero
plane, but rather represents a height of $\sigma_2^2$, the left-most term
of Eq. (7), which results when the estimator of Eq. (3) is
properly matched [to the signal-independent noise process
of Eq. (2)]. In Fig. 1, p is fixed at a value of 1/2, and $\sigma_1^2$ and $\sigma_2^2$ are
set equal to 1 for illustration, with k and E(s) being varied. In
Fig. 2, k is fixed (at k = 1/2) and p is varied. It should be
noted that, for film-grain-noise applications, common values
of k are in the range of from about 0.3 to about 0.7. These
figures illustrate the marked deviation from the variance
achieved by a properly matched signal-independent estimator.
Also, it should be remembered that these surfaces represent
lower bounds on the mean-square estimation error of the
mismatched estimator, and there is no guarantee, in general,
that even this measure of performance can be achieved.
Thus, optimal estimators based on the proper noise model are
needed. These estimators are derived in the following sections.

FIG. 1. Mean-square estimation-error lower bound for the mismatched
case, p = 1/2.

MAP  ESTIMATION 

An appropriate optimal estimate when the signal is random
and its probability density function is known a priori is the
maximum a posteriori probability (MAP) estimate.2 This
estimate, $\hat{s}_{\mathrm{MAP}}$, is defined to be that value of s which maximizes
the a posteriori density p(s|r). In other words, given
the observation r, the signal value $\hat{s}_{\mathrm{MAP}}$ maximizes the
probability of that value of r having been received. Maximizing
p(s|r) is equivalent to maximizing p(r|s)p(s), or alternately
the logarithm of this product. This follows from the
facts that (i)

$$p(s \mid r) = p(r \mid s)\,p(s)/p(r), \tag{11}$$

(ii) the denominator is not a function of s, and (iii)
monotonic transformations (such as the logarithm) preserve
maxima and minima.

FIG. 2. Mean-square estimation-error lower bound for the mismatched
case, k = 1/2.


As an example of the calculation of a MAP estimate, assume
that $n_1$ and $n_2$ are both zero mean, normally distributed
random variables having variances of $\sigma_1^2$ and $\sigma_2^2$, respectively. In
this case, the conditional probability density p(r|s) is also
normal, with a mean of s and variance v(s), given by

$$v(s) = k^2\sigma_1^2\,[f(s)]^2 + \sigma_2^2. \tag{12}$$

It can then be shown that the MAP estimate is a solution of
the equation

$$(r - \hat{s}_{\mathrm{MAP}})^2\,v'(\hat{s}_{\mathrm{MAP}}) + 2(r - \hat{s}_{\mathrm{MAP}})\,v(\hat{s}_{\mathrm{MAP}}) - v'(\hat{s}_{\mathrm{MAP}})\,v(\hat{s}_{\mathrm{MAP}}) + 2[v(\hat{s}_{\mathrm{MAP}})]^2\,\frac{\partial}{\partial \hat{s}_{\mathrm{MAP}}}\ln p(\hat{s}_{\mathrm{MAP}}) = 0, \tag{13}$$

where the prime denotes the partial derivative with respect
to $\hat{s}_{\mathrm{MAP}}$.


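For the class of situations where $f(s) = s^p$, and assuming s
is distributed normally with mean $\mu_s$ and variance $\sigma_s^2$, Eq. (13),
the MAP equation, becomes

$$\Big[2k^2\sigma_1^2(p-1) - \frac{4k^2\sigma_1^2\sigma_2^2}{\sigma_s^2}\Big]\hat{s}_{\mathrm{MAP}}^{\,2p+1} + \Big[2rk^2\sigma_1^2(1-2p) + \frac{4k^2\sigma_1^2\sigma_2^2\mu_s}{\sigma_s^2}\Big]\hat{s}_{\mathrm{MAP}}^{\,2p} + \big[2pk^2\sigma_1^2(r^2 - \sigma_2^2)\big]\hat{s}_{\mathrm{MAP}}^{\,2p-1}$$
$$- \Big(2\sigma_2^2 + \frac{2\sigma_2^4}{\sigma_s^2}\Big)\hat{s}_{\mathrm{MAP}} + \Big(2\sigma_2^2 r + \frac{2\sigma_2^4\mu_s}{\sigma_s^2}\Big) - \big[2pk^4\sigma_1^4\big]\hat{s}_{\mathrm{MAP}}^{\,4p-1} + \frac{2k^4\sigma_1^4\mu_s}{\sigma_s^2}\,\hat{s}_{\mathrm{MAP}}^{\,4p} - \frac{2k^4\sigma_1^4}{\sigma_s^2}\,\hat{s}_{\mathrm{MAP}}^{\,4p+1} = 0. \tag{14}$$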


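The MAP estimate, $\hat{s}_{\mathrm{MAP}}$, is a solution of Eq. (14). For the
specific case where p = 1/2, Eq. (14) reduces to the cubic
equation

$$\frac{2k^4\sigma_1^4}{\sigma_s^2}\,\hat{s}_{\mathrm{MAP}}^{\,3} + \Big(k^2\sigma_1^2 + \frac{4k^2\sigma_1^2\sigma_2^2}{\sigma_s^2} - \frac{2k^4\sigma_1^4\mu_s}{\sigma_s^2}\Big)\hat{s}_{\mathrm{MAP}}^{\,2} + \Big(2\sigma_2^2 + k^4\sigma_1^4 + \frac{2\sigma_2^4}{\sigma_s^2} - \frac{4k^2\sigma_1^2\sigma_2^2\mu_s}{\sigma_s^2}\Big)\hat{s}_{\mathrm{MAP}}$$
$$+ k^2\sigma_1^2(\sigma_2^2 - r^2) - 2\sigma_2^2 r - \frac{2\sigma_2^4\mu_s}{\sigma_s^2} = 0. \tag{15}$$

Substitution of k = 0 into Eq. (14) or Eq. (15) yields the MAP
estimate for the signal-independent noise case of Eq. (2),
namely

$$\hat{s}_{\mathrm{MAP}} = \frac{\sigma_s^2\,r + \sigma_2^2\,\mu_s}{\sigma_s^2 + \sigma_2^2}. \tag{16}$$

FIG. 3. MAP estimator structures for the signal-independent noise process.

Comparison of Eqs. (14) and (16) demonstrates the much
greater complexity of the estimator structure when
signal-dependent noise processes are taken into account.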
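For p = 1/2 and a normal prior, the MAP estimate can be computed by finding the roots of a cubic. The sketch below is illustrative and not from the paper: the coefficients are obtained by substituting $v(s) = k^2\sigma_1^2 s + \sigma_2^2$ and a normal prior with mean `mu_s` and standard deviation `sigma_s` into Eq. (13); all function names are hypothetical. Among the real roots, it keeps the one with the largest log posterior.

```python
import numpy as np


def log_posterior(s, r, k, sigma1, sigma2, mu_s, sigma_s):
    """ln p(r|s) + ln p(s) up to constants, for f(s) = s**0.5 and a normal prior."""
    v = (k * sigma1) ** 2 * s + sigma2 ** 2          # Eq. (12) with p = 1/2
    return -0.5 * np.log(v) - (r - s) ** 2 / (2 * v) - (s - mu_s) ** 2 / (2 * sigma_s ** 2)


def map_cubic_coeffs(r, k, sigma1, sigma2, mu_s, sigma_s):
    """Cubic coefficients (highest degree first) from Eq. (13) with p = 1/2."""
    a, v2, vs = (k * sigma1) ** 2, sigma2 ** 2, sigma_s ** 2
    c3 = 2 * a ** 2 / vs
    c2 = a + 4 * a * v2 / vs - 2 * a ** 2 * mu_s / vs
    c1 = 2 * v2 + a ** 2 + 2 * v2 ** 2 / vs - 4 * a * v2 * mu_s / vs
    c0 = a * (v2 - r ** 2) - 2 * v2 * r - 2 * v2 ** 2 * mu_s / vs
    return [c3, c2, c1, c0]


def map_estimate(r, k, sigma1, sigma2, mu_s, sigma_s):
    """Real root of the cubic that maximizes the log posterior."""
    roots = np.roots(map_cubic_coeffs(r, k, sigma1, sigma2, mu_s, sigma_s))
    real = roots[np.abs(roots.imag) < 1e-8].real
    a, v2 = (k * sigma1) ** 2, sigma2 ** 2
    valid = [float(s) for s in real if a * s + v2 > 0]   # variance must stay positive
    return max(valid, key=lambda s: log_posterior(s, r, k, sigma1, sigma2, mu_s, sigma_s))
```

Setting k = 0 collapses the cubic to a linear equation whose solution is the weighted average of r and the prior mean given in Eq. (16).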

This comparison can also be seen graphically. Fig. 3 represents
Eq. (16) plotted for various parameter values. The
ratio $\sigma_2/\sigma_s$ can approach zero in either of two ways: either the
noise variance is zero or the signal variance is infinite. In the
former case there is no noise, so $\hat{s} = r$; in the latter, the MAP
estimate becomes a ML estimate, $\hat{s} = r$, as discussed in the
next section. The other extreme is for the ratio $\sigma_2/\sigma_s$ to increase
without bound. Here the noise variance becomes infinite
or the signal variance becomes zero, and hence the best
estimate is the (assumed known) signal mean, $\mu_s$.

For the signal-dependent noise case, Fig. 4 shows quite
similar estimator structures. These curves represent the
solution of Eq. (15) plotted for various parameter values. The
most apparent difference from Fig. 3 is the nonlinear nature
of the curves in Fig. 4. Note, however, that the ratio $\sigma_2/(k\sigma_1)^2$
is considered while $\sigma_s$ is fixed for illustration. This corresponds
to various degrees of dominance by either the
signal-independent noise term or the signal-dependent noise term
of Eq. (1). Recall that in both figures, $n_1$ and $n_2$ were considered
to be Gaussian random variables.

FIG. 4. MAP estimator structures for the signal-dependent noise process.


A shortcoming of the Gaussian density as a model for p(s)
is that the probability of a negative value of s is nonzero,
regardless of E(s), and often this can be physically impossible
(for example, when s represents photographic density).
However, one common probability density of interest for some
classes of images is the Rayleigh density,11 i.e.,

$$p(s) = \frac{s}{\sigma_s^2}\exp\Big[-\frac{s^2}{2\sigma_s^2}\Big], \qquad s \ge 0, \tag{17}$$

which has a positivity constraint.


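Substitution into Eq. (13), the MAP equation, with $f(s) = s^p$
as before, yields

$$\Big[2k^2\sigma_1^2(p-1) - \frac{4k^2\sigma_1^2\sigma_2^2}{\sigma_s^2}\Big]\hat{s}_{\mathrm{MAP}}^{\,2p+2} + 2rk^2\sigma_1^2(1-2p)\,\hat{s}_{\mathrm{MAP}}^{\,2p+1} - 2\sigma_2^2\Big(1 + \frac{\sigma_2^2}{\sigma_s^2}\Big)\hat{s}_{\mathrm{MAP}}^{\,2} + 2pk^2\sigma_1^2\Big(r^2 - \sigma_2^2 + \frac{2\sigma_2^2}{p}\Big)\hat{s}_{\mathrm{MAP}}^{\,2p}$$
$$+ 2\sigma_2^2 r\,\hat{s}_{\mathrm{MAP}} - 2k^4\sigma_1^4(p-1)\,\hat{s}_{\mathrm{MAP}}^{\,4p} - \frac{2k^4\sigma_1^4}{\sigma_s^2}\,\hat{s}_{\mathrm{MAP}}^{\,4p+2} + 2\sigma_2^4 = 0. \tag{18}$$

This equation is quite similar to Eq. (14) with $\mu_s = 0$, but each
term is greater in Eq. (18) by degree one. Thus, when p = 1/2,
the MAP estimate $\hat{s}_{\mathrm{MAP}}$ is a solution of the quartic

$$\frac{2k^4\sigma_1^4}{\sigma_s^2}\,\hat{s}_{\mathrm{MAP}}^{\,4} + \Big(k^2\sigma_1^2 + \frac{4k^2\sigma_1^2\sigma_2^2}{\sigma_s^2}\Big)\hat{s}_{\mathrm{MAP}}^{\,3} + \Big(2\sigma_2^2 + \frac{2\sigma_2^4}{\sigma_s^2} - k^4\sigma_1^4\Big)\hat{s}_{\mathrm{MAP}}^{\,2} - \big[k^2\sigma_1^2(r^2 + 3\sigma_2^2) + 2r\sigma_2^2\big]\hat{s}_{\mathrm{MAP}} - 2\sigma_2^4 = 0. \tag{19}$$

As before, substitution of k = 0 into Eq. (19) then yields the
MAP estimate for the signal-independent noise case, given
by

$$\hat{s}_{\mathrm{MAP}} = \frac{\sigma_s^2\,r}{2(\sigma_2^2 + \sigma_s^2)} + \frac{\big[4\sigma_2^2\sigma_s^2(\sigma_2^2 + \sigma_s^2) + r^2\sigma_s^4\big]^{1/2}}{2(\sigma_2^2 + \sigma_s^2)}. \tag{20}$$

Again this is less complex than the MAP equation of Eq. (18),
but it is not as simple as the solution for the signal-independent
noise model with a normal probability density for s.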
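The Rayleigh-prior case with p = 1/2 can be handled the same way, by solving a quartic numerically. The following sketch is an illustration under stated assumptions rather than the authors' procedure: the quartic coefficients are derived from Eq. (13) with $v(s) = k^2\sigma_1^2 s + \sigma_2^2$ and a Rayleigh prior with parameter `sigma_s`, and the function names are hypothetical. Because the coefficient signs change exactly once, the quartic has a single positive root, which is returned.

```python
import numpy as np


def log_posterior_rayleigh(s, r, k, sigma1, sigma2, sigma_s):
    """ln p(r|s) + ln p(s) up to constants, f(s) = s**0.5, Rayleigh prior, s > 0."""
    v = (k * sigma1) ** 2 * s + sigma2 ** 2
    return -0.5 * np.log(v) - (r - s) ** 2 / (2 * v) + np.log(s) - s ** 2 / (2 * sigma_s ** 2)


def rayleigh_quartic_coeffs(r, k, sigma1, sigma2, sigma_s):
    """Quartic coefficients (highest degree first) for the p = 1/2 MAP equation."""
    a, v2, vs = (k * sigma1) ** 2, sigma2 ** 2, sigma_s ** 2
    c4 = 2 * a ** 2 / vs
    c3 = a + 4 * a * v2 / vs
    c2 = 2 * v2 + 2 * v2 ** 2 / vs - a ** 2
    c1 = -(a * (r ** 2 + 3 * v2) + 2 * r * v2)
    c0 = -2 * v2 ** 2
    return [c4, c3, c2, c1, c0]


def rayleigh_map_estimate(r, k, sigma1, sigma2, sigma_s):
    """Positive real root of the quartic (unique by Descartes' rule of signs)."""
    roots = np.roots(rayleigh_quartic_coeffs(r, k, sigma1, sigma2, sigma_s))
    real = roots[np.abs(roots.imag) < 1e-8].real
    return float(max(real[real > 0]))


def rayleigh_map_k0(r, sigma2, sigma_s):
    """Closed form for k = 0, the signal-independent case."""
    v2, vs = sigma2 ** 2, sigma_s ** 2
    return (vs * r + np.sqrt(vs ** 2 * r ** 2 + 4 * v2 * vs * (vs + v2))) / (2 * (vs + v2))
```

With k = 0 the quartic degenerates to a quadratic and the numerical root should coincide with the closed form of the signal-independent case.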


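Another probability density with a positivity constraint is
the folded normal, that is, the absolute value of a normally
distributed random variable. Its probability density function
is given by

$$p(s) = [p_N(s) + p_N(-s)]\,u(s), \tag{21}$$

where u(s) is the unit step function and $p_N(s)$ is the normal
probability density function. After much manipulation, it
may be shown that the MAP estimate for this case is given by
the value of $\hat{s}_{\mathrm{MAP}}$ which satisfies

$$\exp\Big(\frac{2\mu_s\hat{s}_{\mathrm{MAP}}}{\sigma_s^2}\Big)\Big[\frac{\mu_s - \hat{s}_{\mathrm{MAP}}}{\sigma_s^2} + \frac{r - \hat{s}_{\mathrm{MAP}}}{v(\hat{s}_{\mathrm{MAP}})}\Big(1 + \frac{(r - \hat{s}_{\mathrm{MAP}})\,v'(\hat{s}_{\mathrm{MAP}})}{2v(\hat{s}_{\mathrm{MAP}})}\Big) - \frac{v'(\hat{s}_{\mathrm{MAP}})}{2v(\hat{s}_{\mathrm{MAP}})}\Big]$$
$$- \Big[\frac{\mu_s + \hat{s}_{\mathrm{MAP}}}{\sigma_s^2} - \frac{r - \hat{s}_{\mathrm{MAP}}}{v(\hat{s}_{\mathrm{MAP}})}\Big(1 + \frac{(r - \hat{s}_{\mathrm{MAP}})\,v'(\hat{s}_{\mathrm{MAP}})}{2v(\hat{s}_{\mathrm{MAP}})}\Big) + \frac{v'(\hat{s}_{\mathrm{MAP}})}{2v(\hat{s}_{\mathrm{MAP}})}\Big] = 0, \tag{22}$$

where v(s) is given by Eq. (12), and the prime again denotes
differentiation with respect to s. To obtain the MAP estimate
for the signal-independent measurement model of Eq. (2), k
= 0 is substituted into Eq. (22) to obtain

$$\exp\Big(\frac{2\mu_s\hat{s}_{\mathrm{MAP}}}{\sigma_s^2}\Big)\Big[\frac{\mu_s - \hat{s}_{\mathrm{MAP}}}{\sigma_s^2} + \frac{r - \hat{s}_{\mathrm{MAP}}}{\sigma_2^2}\Big] - \Big[\frac{\mu_s + \hat{s}_{\mathrm{MAP}}}{\sigma_s^2} - \frac{r - \hat{s}_{\mathrm{MAP}}}{\sigma_2^2}\Big] = 0. \tag{23}$$

Neither of these equations lends itself to straightforward
solution; however, it is once again obvious that the
signal-independent noise model yields a much simpler solution.

J. Opt. Soc. Am., Vol. 68, No. 12, December 1978
Froehlich et al.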


MAXIMUM-LIKELIHOOD  ESTIMATION 

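Another commonly used estimator is the maximum-likelihood
(ML) estimator.2 The ML estimate is employed when
no prior knowledge of the signal is assumed, and it is found
by maximizing p(r|s) over s. In other words, find a value of
s such that, given s, the most probable observation r which
would result is the value observed. Using the signal-dependent
measurement model of Eq. (1), and still assuming $n_1$ and
$n_2$ are zero mean normal random variables with variances $\sigma_1^2$
and $\sigma_2^2$, respectively, the ML estimate, $\hat{s}_{\mathrm{ML}}$, is a solution of the
equation

$$(r - \hat{s}_{\mathrm{ML}})^2\,v'(\hat{s}_{\mathrm{ML}}) + 2(r - \hat{s}_{\mathrm{ML}})\,v(\hat{s}_{\mathrm{ML}}) - v'(\hat{s}_{\mathrm{ML}})\,v(\hat{s}_{\mathrm{ML}}) = 0, \tag{24}$$

where $v(\hat{s}_{\mathrm{ML}})$ and $v'(\hat{s}_{\mathrm{ML}})$ are as defined previously. Again,
considering the special case $f(s) = s^p$, the ML equation becomes

$$2k^2\sigma_1^2(p-1)\,\hat{s}_{\mathrm{ML}}^{\,2p+1} + 2rk^2\sigma_1^2(1-2p)\,\hat{s}_{\mathrm{ML}}^{\,2p} + 2pk^2\sigma_1^2(r^2 - \sigma_2^2)\,\hat{s}_{\mathrm{ML}}^{\,2p-1} - 2\sigma_2^2\,\hat{s}_{\mathrm{ML}} - 2pk^4\sigma_1^4\,\hat{s}_{\mathrm{ML}}^{\,4p-1} + 2\sigma_2^2 r = 0. \tag{25}$$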

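This equation is at worst no more complex than the MAP
equation (14). In fact, for p = 1/2, Eq. (25) becomes the
quadratic equation

$$k^2\sigma_1^2\,\hat{s}_{\mathrm{ML}}^{\,2} + (2\sigma_2^2 + k^4\sigma_1^4)\,\hat{s}_{\mathrm{ML}} + k^2\sigma_1^2(\sigma_2^2 - r^2) - 2\sigma_2^2 r = 0. \tag{26}$$

This has as its positive root the ML estimate

$$\hat{s}_{\mathrm{ML}} = \frac{-(2\sigma_2^2 + k^4\sigma_1^4) + \big[k^8\sigma_1^8 + 4(k^2\sigma_1^2 r + \sigma_2^2)^2\big]^{1/2}}{2k^2\sigma_1^2}. \tag{27}$$

The ML estimate for the signal-independent model of Eq. (2)
is found by letting k = 0 in any of Eqs. (24)-(26), and is given
by

$$\hat{s}_{\mathrm{ML}} = r. \tag{28}$$

Note that this is the minimum variance unbiased estimate
used in Eq. (3) for the mismatched example, for which we
earlier found the mean-square estimation error.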


A graphical comparison of Eq. (27) and Eq. (28) for various
parameter values is shown in Fig. 5. In this figure, the dashed
line represents Eq. (28), while the solid lines represent Eq.
(27). Note especially the estimator structure when the
signal-independent noise term is comparable to or greater than
the signal-dependent noise term. In this case, the estimator
takes the form $\hat{s}_{\mathrm{ML}} = r - b$, where b is an approximately
constant bias determined by the ratio $\sigma_2/(k\sigma_1)^2$. Note that
as this ratio increases without bound, b asymptotically approaches 0.5.

FIG. 5. ML estimator structures.
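The ML estimate for p = 1/2 can be evaluated in closed form as the larger root of the quadratic in Eq. (26). The sketch below is illustrative, with hypothetical function names; the root expression is obtained from the quadratic formula applied to Eq. (26). It also lets the bias $b = r - \hat{s}_{\mathrm{ML}}$ be examined as the signal-independent noise grows.

```python
import numpy as np


def ml_quadratic(s, r, k, sigma1, sigma2):
    """Left-hand side of Eq. (26) evaluated at s."""
    a, v2 = (k * sigma1) ** 2, sigma2 ** 2
    return a * s ** 2 + (2 * v2 + a ** 2) * s + a * (v2 - r ** 2) - 2 * v2 * r


def ml_estimate(r, k, sigma1, sigma2):
    """Larger root of Eq. (26); reduces to s_hat = r when k = 0 (Eq. (28))."""
    a, v2 = (k * sigma1) ** 2, sigma2 ** 2
    if a == 0.0:
        return float(r)
    disc = a ** 4 + 4.0 * (a * r + v2) ** 2          # discriminant of Eq. (26)
    return (-(2.0 * v2 + a ** 2) + np.sqrt(disc)) / (2.0 * a)
```

Expanding the square root for large $\sigma_2$ gives $\hat{s}_{\mathrm{ML}} \approx r - k^2\sigma_1^2/2$, so with $k\sigma_1 = 1$ the bias tends to 0.5, in line with the behaviour described for Fig. 5.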

Another point worthy of note is the similarity between Eq.
(13), the general MAP equation, and the ML equation, Eq.
(24). These expressions differ only by an additional term in
Eq. (13), and it is this term which contains all of the prior
knowledge about s. This term vanishes when ln p(s), and
hence p(s), is constant. In other words, if s is distributed
uniformly over all of its space of definition (a worst case), then
knowledge of its value in no way affects the maximum of
$p(s \mid r) \propto p(r \mid s)\,p(s) \propto p(r \mid s)$. Thus, the ML estimator can
be viewed as a worst case of the MAP estimator. Because the
MAP estimator embodies a priori information about s that
is not present in the formation of the ML estimate, it would
seem reasonable to assume that the MAP estimate would
exhibit a smaller mean-square estimation error than the ML
estimate. It will be seen that this is indeed the case. In the
next section, bounds on the variances of these estimates will
be found.


CRAMER-RAO  LOWER  BOUNDS 

A well-known lower bound on the variance of any unbiased
estimate for a fixed but unknown s is the Cramer-Rao error
bound.2 Given the conditional density p(r|s), the Cramer-Rao
bound is given by

$$\operatorname{var}[\hat{s} - s \mid s] \ge \Big\{E\Big[\Big(\frac{\partial \ln p(r \mid s)}{\partial s}\Big)^2 \,\Big|\, s\Big]\Big\}^{-1}. \tag{29}$$

For $n_1$ and $n_2$ normal with zero mean, Eq. (29) reduces to

$$\operatorname{var}[\hat{s} - s \mid s] \ge \frac{2[v(s)]^2}{2v(s) + [v'(s)]^2}, \tag{30}$$

where v(s) and v'(s) are as given by Eq. (12). For the
signal-independent noise model, which is the result of letting k
= 0 in Eq. (30), the Cramer-Rao bound is given by

$$\operatorname{var}[\hat{s} - s \mid s] \ge \sigma_2^2, \tag{31}$$


which is the variance actually achieved by the ML estimate
of Eq. (28) for the signal-independent noise case. When
equality holds in Eq. (29), the estimate $\hat{s}$ is said to be efficient.2
Thus, the signal-independent ML estimate is efficient when
the measurement is actually of the form given by Eq. (2).

FIG. 6. Cramer-Rao lower bound on estimation error for the signal-dependent
measurement model, k = 1/2.

FIG. 8. Mean-square estimation error for the MAP estimator, as a function
of the signal mean E(s), with (a) $k\sigma_1 = 1$ and (b) $k\sigma_1 = 2$. The solid line
is the signal-dependent estimator error and the dashed line is the mismatched
estimator error, and $\sigma_2 = 1$.
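The Gaussian form of the Cramer-Rao bound, Eq. (30), can be checked against a direct numerical evaluation of the Fisher information. The sketch below is illustrative only: it assumes $f(s) = s^p$ and Gaussian noise, the function names are hypothetical, and the expectation over r is approximated by a Riemann sum on a fine grid.

```python
import numpy as np


def variance(s, k, p, sigma1, sigma2):
    """v(s) of Eq. (12) with f(s) = s**p."""
    return (k * sigma1) ** 2 * s ** (2 * p) + sigma2 ** 2


def variance_prime(s, k, p, sigma1, sigma2):
    """Derivative of v(s) with respect to s."""
    return 2 * p * (k * sigma1) ** 2 * s ** (2 * p - 1)


def cramer_rao_bound(s, k, p, sigma1, sigma2):
    """Right-hand side of Eq. (30)."""
    v = variance(s, k, p, sigma1, sigma2)
    dv = variance_prime(s, k, p, sigma1, sigma2)
    return 2 * v ** 2 / (2 * v + dv ** 2)


def fisher_information_numeric(s, k, p, sigma1, sigma2, points=200001):
    """E[(d ln p(r|s)/ds)^2] for r ~ Normal(s, v(s)), by a Riemann sum over r."""
    v = variance(s, k, p, sigma1, sigma2)
    dv = variance_prime(s, k, p, sigma1, sigma2)
    r = np.linspace(s - 12 * np.sqrt(v), s + 12 * np.sqrt(v), points)
    pdf = np.exp(-(r - s) ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)
    score = -dv / (2 * v) + (r - s) / v + (r - s) ** 2 * dv / (2 * v ** 2)
    return np.sum(score ** 2 * pdf) * (r[1] - r[0])
```

The product of `cramer_rao_bound` and `fisher_information_numeric` should be close to one, and with k = 0 the bound reduces to $\sigma_2^2$ as in Eq. (31).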

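For $f(s) = s^p$, as before, Eq. (30) becomes

$$\operatorname{var}[\hat{s} - s \mid s] \ge \frac{k^4\sigma_1^4 s^{4p} + 2k^2\sigma_1^2\sigma_2^2 s^{2p} + \sigma_2^4}{k^2\sigma_1^2 s^{2p} + 2p^2k^4\sigma_1^4 s^{4p-2} + \sigma_2^2}, \tag{32}$$

which for p = 1/2 reduces to

$$\operatorname{var}[\hat{s} - s \mid s] \ge \frac{k^4\sigma_1^4 s^2 + 2k^2\sigma_1^2\sigma_2^2 s + \sigma_2^4}{k^2\sigma_1^2 s + k^4\sigma_1^4/2 + \sigma_2^2}. \tag{33}$$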

Although it is not obvious by inspection, the bound given by Eq. (32) may
actually be smaller than the bound given by Eq. (31). In other words, there
are potentially cases where the estimators designed for the signal-dependent
measurement model may actually outperform (in a mean-square-estimation-error
sense) the estimators designed for the signal-independent measurement model.
To better illustrate


FIG. 7. Cramér-Rao lower bound on estimation error for the signal-dependent
measurement model, $p = 1/2$.


this, Eq. (32) is plotted in Figs. 6 and 7. In the first of these, $k$ is
fixed at 1/2, $\sigma_2$ and $\sigma_1$ at one, and $s$ and $p$ are varied. As
in Figs. 1 and 2, the plane upon which the surface rests is not the zero
plane, but rather is the Cramér-Rao lower bound given by Eq. (31), namely
$\sigma_2^{2}$. In Fig. 7, $p$ is fixed at 1/2 and $k\sigma_1$ is allowed to
vary. Now it is worth noting that in all of the previous equations, when
$k \ne 0$, $k$ and $\sigma_1$ always appear together. Thus varying $k\sigma_1$
is tantamount to fixing either one and varying the other. Note that in
Fig. 7, for certain values of $k\sigma_1$ and $s$, the Cramér-Rao bound of
Eq. (32) dips below the Cramér-Rao bound of Eq. (31), that is, it dips below
the plane $\sigma_2^{2}$. This is, of course, the region mentioned above,
where the inclusion of signal dependence in the measurement model may
potentially result in improved estimator performance. The values of $s$ and
$k\sigma_1$ which result in this region are given by

$$0 < s < \frac{\sigma_2^{2}}{2}\left[ 1 - \frac{2}{(k\sigma_1)^{2}} \right], \qquad (34)$$

where $k\sigma_1$ must then satisfy

$$k\sigma_1 > \sqrt{2}. \qquad (35)$$

Recall that these equations are derived for the $p = 1/2$ case.
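As a numerical check of the region described by Eqs. (34) and (35), the
following Python sketch compares the $p = 1/2$ bound with the
signal-independent bound $\sigma_2^{2}$ of Eq. (31). It assumes that Eq. (33)
has the form $(k^{2}\sigma_1^{2}s + \sigma_2^{2})^{2} / [(k^{2}\sigma_1^{2} + k^{4}\sigma_1^{4}/2)s + \sigma_2^{2}]$,
which is the form consistent with Eqs. (34) and (35). The function names are
illustrative and do not come from the paper.

```python
def crb_signal_dependent(s, k_sigma1, sigma2):
    """Assumed form of Eq. (33): Cramer-Rao bound for p = 1/2."""
    a = k_sigma1 ** 2                  # k^2 * sigma_1^2
    v = a * s + sigma2 ** 2            # conditional variance of r given s
    return v ** 2 / ((a + a ** 2 / 2.0) * s + sigma2 ** 2)


def s_upper_limit(k_sigma1, sigma2):
    """Right-hand side of Eq. (34); it is positive only when Eq. (35) holds."""
    return (sigma2 ** 2 / 2.0) * (1.0 - 2.0 / k_sigma1 ** 2)


def dips_below(s, k_sigma1, sigma2):
    """True when the assumed Eq. (33) bound lies below sigma_2^2 of Eq. (31)."""
    return crb_signal_dependent(s, k_sigma1, sigma2) < sigma2 ** 2


# Example: k*sigma_1 = 2 satisfies Eq. (35); k*sigma_1 = 1 does not.
for k_sigma1 in (1.0, 2.0):
    flags = [dips_below(s, k_sigma1, 1.0) for s in (0.1, 0.2, 0.5)]
    print(k_sigma1, s_upper_limit(k_sigma1, 1.0), flags)
```

With $\sigma_2 = 1$ and $k\sigma_1 = 2$, the limit of Eq. (34) evaluates to
0.25, so under this assumed form only signal values below 0.25 fall in the
region where the signal-dependent bound is the smaller of the two.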

To get a feeling for the actual mean-square estimation error achieved by the
estimators derived above, Monte Carlo simulations were performed, with the
results presented in the next section.

MONTE  CARLO  SIMULATIONS 

The performance of each of the estimators derived in the previous sections
was evaluated by Monte Carlo simulations to determine the mean-square
estimation error. The results for each of the various signal probability
densities were so similar that only one case is presented. The Gaussian case
was chosen since, for the MAP estimate, it represents the minimum achievable
mean-square estimation error (see Appendix). Figure 8 shows the mean-square
estimation error (MSEE) of the MAP estimate plotted as a function of the
signal mean $E(s)$. In Fig. 8(a), $k\sigma_1 = 1$, while in Fig. 8(b),
$k\sigma_1 = 2$. The solid line is the MSEE for the MAP estimator of Eq. (15)
and the dashed line is the MSEE for the mismatched case, that is, for the MAP
estimate of Eq. (16) when applied to the signal-dependent measurement.
Inclusion of signal dependence in the estimator structure is seen to yield
estimates of the signal which, on the average, have smaller error than would
be the case when signal dependence is ignored. It should be noted that for
sufficiently small $k\sigma_1$ and small signal means the signal-dependent
noise term is negligible. This results in the estimates for the mismatched
case being very nearly equal to those which include the signal dependence.

J. Opt. Soc. Am., Vol. 68, No. 12, December 1978

Froehlich et al.

FIG. 9. Mean-square estimation error for the ML estimator, as a function of
the signal mean $E(s)$, with (a) $k\sigma_1 = 1$ and (b) $k\sigma_1 = 2$. The
solid line is the signal-dependent estimator error and the dashed line is the
mismatched estimator error.

Figure 9 presents the results of simulations of the ML estimators. As
before, the solid line represents the signal-dependent estimator MSEE and the
dashed line represents the MSEE for the mismatched case. Once again,
inclusion of signal dependence is seen to yield better estimates on the
average. Since the ML estimates include no prior knowledge of the signal
statistics, their performance is markedly inferior to the MAP estimates, but
as previously discussed, the ML estimate represents a worst case. As before,
for small $k\sigma_1$ and small $E(s)$, the estimates are very nearly equal
regardless of the inclusion of signal dependence in the estimator structure.

CONCLUSION 

Many physical processes are described by a signal-dependent observation
model. It has been shown that, in such cases, ignoring the signal dependence
for purposes of designing estimators of the signal may result in severe
penalties in terms of estimation error. Therefore, optimal estimators which
include the signal-dependent structure were derived. Specifically, these
were ML estimates, which include no prior knowledge of signal statistics, and
MAP estimates, which assume prior knowledge of the signal probability
density. The latter estimate was derived for the Gaussian, Rayleigh, and
folded Gaussian density functions. The performance of these estimators was
then investigated by Monte Carlo simulation. As expected, inclusion of signal
dependence in the estimator structure resulted in improved estimator
performance.


FIG.  10.  The  uniform  cost 
function. 


It should be noted that there may exist suboptimal estimators which prove to
be more desirable in terms of implementation or other practical
considerations. For example, no mention has been made of how to obtain the
signal statistics necessary in the formulation of MAP estimators. However,
the purposes of this paper were to demonstrate the potentially severe
consequences of ignoring signal dependence and to derive optimal estimators
for the signal-dependent noise case.
APPENDIX

Bayesian estimators are those estimators which serve to minimize the Bayes
risk, where the Bayes risk is the expected cost of estimation based on some
cost function. For example, minimum mean-square error is achieved when the
cost is proportional to the square of the estimation error, i.e., when the
cost function is a parabola. The MAP estimator is a Bayesian estimator based
on the uniform cost function shown in Fig. 10.² The cost for no error is zero
(as it is for some $\Delta$ region about no error), and the cost of any other
error is uniform (all errors are weighted equally).

It can be shown² that, under certain conditions, the optimal Bayes estimate
is invariant for a variety of cost functions, and is equal to the minimum
mean-square error estimate. These conditions are: (i) the cost function is
convex, (ii) the cost function is symmetrical, (iii) the a posteriori
probability density, $p(s|r)$, is symmetrical, and
(iv) $\lim_{s \to \infty} C(s)\,p(s|r) = 0$, where $C(s)$ is the cost function
with argument $s$. Condition (iv) is simply a requirement that the a
posteriori density goes to zero faster than the cost function increases.
Viterbi¹⁴ has shown that the uniform cost function satisfies these
conditions. When the prior signal density, $p(s)$, is assumed Gaussian, then
clearly $p(s|r)$ is symmetrical, as required in condition (iii). Thus, for
this case we have the optimal Bayesian estimate, and it is the estimate which
yields the minimum mean-square estimation error.
INTRODUCTION

Many physical noise processes are signal-dependent. One well known example
is film-grain noise (1-3).

In this paper optimal estimators for images in signal-dependent film-grain
noise are presented.

THE  MODEL 

A  versatile  model  incorporating  both  signal-independent 
additive  noise  and  signal-dependent  noise  is  utilized. 

This model is given in Eq. (1),

$$r = s + k f(s)\, n_1 + n_2, \qquad (1)$$

where $r$ is the observed photographic density, $s$ is the original
uncorrupted image density, $k$ is the scanning constant, $f(s)$ is some
function of $s$, and $n_1$ and $n_2$ are signal-independent noise processes.
Thus, the middle term on the right-hand side of Eq. (1) is the
signal-dependent noise term.

It is assumed that $n_1$, $n_2$, and $s$ are mutually statistically
independent. To apply the model to film-grain noise problems, let
$f(s) = s^{p}$, where $p$ is usually taken to be 1/2 or 1/3 (3).

In this paper, we let $p = 1/2$ and we assume $n_1$ and $n_2$ are zero mean
Gaussian random variables, with variances $\sigma_1^{2}$ and $\sigma_2^{2}$,
respectively. Further, $s$ is assumed to be a Gaussian random variable with
mean $\mu_s$ and variance $\sigma_s^{2}$.
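To make the observation model concrete, the following Python sketch draws
samples from Eq. (1) with $f(s) = s^{1/2}$ and compares the empirical noise
power with the conditional variance $k^{2}\sigma_1^{2}s + \sigma_2^{2}$ implied
by the model. All numerical values and function names are illustrative
assumptions, not values or code from the paper, and the sketch holds $s$ fixed
at a nonnegative value so that the square root is real.

```python
import math
import random


def simulate_observation(s, k, sigma1, sigma2, rng):
    """Draw one sample r = s + k*sqrt(s)*n1 + n2 for a nonnegative density s."""
    n1 = rng.gauss(0.0, sigma1)    # noise that drives the signal-dependent term
    n2 = rng.gauss(0.0, sigma2)    # signal-independent additive noise
    return s + k * math.sqrt(s) * n1 + n2


def conditional_noise_variance(s, k, sigma1, sigma2):
    """Variance of r given s implied by Eq. (1) with p = 1/2."""
    return k ** 2 * sigma1 ** 2 * s + sigma2 ** 2


def empirical_noise_variance(s, k, sigma1, sigma2, trials=50000, seed=0):
    """Monte Carlo estimate of E[(r - s)^2 | s]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        r = simulate_observation(s, k, sigma1, sigma2, rng)
        total += (r - s) ** 2
    return total / trials


# Hypothetical parameter choice: s = 2, k = 0.5, sigma_1 = 1, sigma_2 = 0.4.
print(empirical_noise_variance(2.0, 0.5, 1.0, 0.4),
      conditional_noise_variance(2.0, 0.5, 1.0, 0.4))
```

The growth of the noise power with $s$ is what a signal-independent estimator
ignores when it is applied to data of this kind.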

THE  ESTIMATOR  STRUCTURES 

The maximum likelihood (ML) estimate is found by maximizing p(r/s) over s
(3). For the model of Eq. (1), the estimate is found to be

which results when the signal-dependent noise term of Eq. (1) is omitted.

The maximum a posteriori probability (MAP) estimate is found by maximizing
p(s/r) over s (3). For the model of Eq. (1) and the above assumptions, the
estimate $\hat{s}_{\mathrm{MAP}}$ is found to be the solution of

Again, omission of the signal-dependent noise term in Eq. (1) results in a
comparatively simplified estimate,

Because this MAP estimate includes prior information about the image, it
should give superior performance. In fact, under the above assumptions it
can be shown that the MAP estimator minimizes the mean square estimation
error (3).

RESULTS 


Figure 1 is the original image of an archer. Figure 2 is the noisy image
generated digitally according to the model of Eq. (1). The image in Fig. 3
is the estimate found by the solution of the MAP equation, Eq. (4), with
$\mu_s$ and $\sigma_s^{2}$ taken to be the sample mean and variance of the
original image.

One factor severely affecting estimator performance is violation of the
assumption that the image statistics are Gaussian. For a discussion of this,
see the paper by Froehlich et al. (3).


Fig.  2.  The  noisy  Fi g.  3.  The  esti- 

image,  <3^*0. 4,  mate. 
Abstract

In this paper we consider a class of patterns which are subject to the
action of a group of transformations. We are particularly concerned with the
existence of measurements or features which are invariant with respect to
transformation. A concept of relative invariance is also introduced and
explored in depth. In a very general sense, it is shown that every invariant
(and relative invariant) is a suitable average over the relevant group of
transformations. Finally, invariant means of bounded functions are used to
explore existence of pattern invariants. Suggestions for further research
are also given.

Key Words and Phrases: Pattern, group, transformation, feature, invariant,
relative invariant, group average, invariant mean, representation, linear
transformation.

1.  Introduction 

The importance of group theory as a tool to be exploited in modelling a
variety of perceptual phenomena has been demonstrated by a number of writers
[2,10,11,16]. Although the influence of group theory is implicit in much of
the literature on pattern recognition [1,6,12,15,18], relatively few
instances can be found in which explicit utilization of group theory is the
central theme [4,8,12]. Without exception, group theory has been used to
effectively model some aspect or feature which is invariant under
transformation and to exploit this invariance in performing the recognition
function. However, no definitive study has been made of transformational
invariance and no general model has been introduced which attempts to
formalize the concept of invariance as it relates to pattern recognition.
This is indeed strange in view of the relatively advanced state of the theory
of invariants within group theory [9,14,19].

In the following we formulate a general model in which many problems in
pattern recognition may be cast in a natural fashion. We discuss
representations of patterns as functions defined on a group and proceed to
investigate the existence of invariant functionals.

2.  The  Model 

Let $\Omega$ denote a set of objects called patterns and assume that $G$ is a
group of transformations which act on $\Omega$ on the left. For
$\omega \in \Omega$ and $g \in G$ we denote by $g\omega$ the image of $\omega$
under the transformation $g$. Also, for $g_1, g_2 \in G$ we denote their
product or composition by $g_1 g_2$. Action on the left is then given by the
identity

$$(g_1 g_2)\omega = g_1(g_2\omega), \qquad (1)$$

for $g_1, g_2 \in G$ and $\omega \in \Omega$.

We now assume that our ability to "recognize" and/or otherwise "classify"
patterns is obtained via measurements performed upon individual patterns.
Such measurements can take values of a quite general nature, although the
usual situation will result in a vector of real numbers. Accordingly, we
define a measurement function to be a mapping $R: \Omega \to V$, where $V$ is
a suitable set of permissible values. We shall later assume that $V$ is a
real finite-dimensional vector space. We say that a measurement function
$R: \Omega \to V$ is invariant provided that $R(g\omega) = R(\omega)$ for all
$\omega \in \Omega$ and $g \in G$. Observe that an invariant measurement does
not distinguish between the various members of an orbit
$[\omega] = \{ g\omega \mid g \in G \}$, being constant on each such orbit.

More generally, we say that a measurement $R: \Omega \to V$ is relatively
invariant provided that $R(g\omega) = \rho(g)R(\omega)$. Here $\rho$ is a
homomorphism of $G$ into a group of transformations on $V$ and is called the
modulus of $R$. As a matter of practice, we are interested in the case in
which $V$ is a finite-dimensional vector space and $\rho$ is a representation
of $G$ in the group $GL(V)$ of invertible linear transformations on $V$. Note
that a relative invariant not only depends upon the orbit of $\omega$ but is
also sensitive to "position" within the orbit.

In applications one must solve simultaneous equations
$R_\alpha(\omega) = R_\alpha^{m}$ involving a number of invariants
$\{R_\alpha\}$ and associated actual measurements $\{R_\alpha^{m}\}$ to
classify the orbit of $\omega$ and then solve similar equations
$\rho_\beta(g)R_\beta(\omega) = R_\beta^{m}$ involving relative invariants to
determine position within the orbit. Hence we see that the question of
existence of invariants and relative invariants becomes of paramount
importance.

3.  Representations 

In order to pursue the question of existence of invariants we find the need
of considerably more structure than we have assumed at this point. It is
somewhat surprising that this additional structure can be imposed on the
transformation group and need not involve restrictive assumptions about the
space of patterns. Since the transformation groups that are typically
encountered are quite rich in structure, we find ourselves in an
advantageous situation.

Let us briefly digress. Suppose that $X$ is any set and that the group $G$
acts on $X$ on the left. Let $f: X \to Y$ be a mapping of $X$ to some set
$Y$. Then for any $g \in G$ we may define a new mapping $gf: X \to Y$ given
by

$$(gf)(x) = f(g^{-1}x), \quad x \in X. \qquad (2)$$

Note that the appearance of $g^{-1}$, rather than $g$, is a convenience which
makes certain formulae more natural for later use. We easily verify that

$$g_1(g_2 f) = (g_1 g_2) f \qquad (3)$$

so that if $F$ is a set of functions such that $gf \in F$ for all $f \in F$,
then (2) defines an action of $G$ on the left of $F$.

Now let $R: \Omega \to V$ be a given measurement function. For each
$\omega \in \Omega$ we may define a function $\omega^{r}: G \to V$ as follows:

$$\omega^{r}(x) = R(x^{-1}\omega), \quad x \in G. \qquad (4)$$

The correspondence $r: \omega \to \omega^{r}$ thus defines a mapping of
$\Omega$ into the set $F(G,V)$ of functions from $G$ to $V$. Now for fixed
$g \in G$ we see that for all $x \in G$,

$$(g\omega^{r})(x) = \omega^{r}(g^{-1}x) = R((g^{-1}x)^{-1}\omega) = R((x^{-1}g)\omega) = R(x^{-1}(g\omega)) = [(g\omega)^{r}](x).$$

That is,

$$g\omega^{r} = (g\omega)^{r}, \qquad (5)$$

for all $\omega \in \Omega$ and $g \in G$. Equation (5) establishes the
desired connection between our patterns and the $V$-valued functions on $G$.
We define a representation of $\Omega$ on $V$ to be a map
$r: \Omega \to F(G,V)$ which satisfies (5). Such a representation allows a
concrete interpretation of patterns as suitable functions defined on the
group.

We have the following:

Theorem 1. The representations $r$ of $\Omega$ on $V$ correspond one-to-one
to the measurement functions $R: \Omega \to V$. The correspondence is given
via $r \leftrightarrow R$ if and only if

$$\omega^{r}(x) = R(x^{-1}\omega), \quad x \in G, \ \omega \in \Omega.$$

Proof: We have already seen that each measurement $R$ defines a
representation. Now, let $\omega \to \tilde{\omega}$ be a representation of
$\Omega$ on $V$ and let us set $R(\omega) = \tilde{\omega}(1_G)$, where $1_G$
is the identity element of $G$. We must show that $\omega^{r}$ as defined by
(4) satisfies $\omega^{r} = \tilde{\omega}$. But for $x \in G$ we have
$\tilde{\omega}(x) = (x^{-1}\tilde{\omega})(1_G) = R(x^{-1}\omega) = \omega^{r}(x)$,
where the middle equality uses (5), from which $\tilde{\omega} = \omega^{r}$,
as desired.
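The correspondence of Theorem 1 can be checked on a small finite example. The
Python sketch below is an illustration only: it assumes the cyclic group
$Z_n$ acting on length-$n$ tuples by cyclic shifts, and the measurement
`first_sample` and all function names are hypothetical choices that do not
appear in the paper.

```python
def act(g, pattern):
    """Left action of g in Z_n on a pattern (a tuple): cyclic shift by g places."""
    n = len(pattern)
    return tuple(pattern[(j - g) % n] for j in range(n))


def representation(R, pattern):
    """Eq. (4): omega^r(x) = R(x^{-1} omega), returned as a list indexed by x."""
    n = len(pattern)
    return [R(act((-x) % n, pattern)) for x in range(n)]


def act_on_function(g, f):
    """Eq. (2) on Z_n: (g f)(x) = f(g^{-1} x) = f(x - g)."""
    n = len(f)
    return [f[(x - g) % n] for x in range(n)]


def first_sample(pattern):
    """A hypothetical measurement R: read the first entry of the pattern."""
    return pattern[0]


omega = (3, 1, 4, 1, 5)
omega_r = representation(first_sample, omega)
# Eq. (5): g omega^r == (g omega)^r for every group element g.
print(all(act_on_function(g, omega_r) == representation(first_sample, act(g, omega))
          for g in range(len(omega))))
# Theorem 1: R is recovered by evaluating omega^r at the identity.
print(omega_r[0] == first_sample(omega))
```

For this particular measurement $\omega^{r}$ is simply the pattern itself read
as a function on the group, which is one way to see patterns as functions
defined on $G$.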

Let us point out that an invariant measurement is characterized by the
condition that each $\omega^{r}$ is a constant function, which on the surface
seems somewhat uninteresting. This is a deceptive simplification, however,
as will be apparent later. Similarly, if $R$ is a relative invariant with
modulus $\rho$, we see that $\omega^{r}(x) = \rho(x^{-1})R(\omega)$.

Before proceeding to pursue the existence of invariants, it seems
appropriate to further accent the importance of relative invariants by
demonstrating one of their fundamental properties. Let $R: \Omega \to V$ be a
relative invariant with modulus $\rho$. Suppose that
$\omega_1, \omega_2 \in \Omega$ and that $R(\omega_1) = R(\omega_2)$. Then for
any $g \in G$ we have
$R(g\omega_1) = \rho(g)R(\omega_1) = \rho(g)R(\omega_2) = R(g\omega_2)$. Thus

$$R(\omega_1) = R(\omega_2) \ \text{implies} \ R(g\omega_1) = R(g\omega_2).$$

It is somewhat interesting to note that the condition above is a complete
characterization of relative invariants, as is shown in the following:

Theorem 2. In order that $R: \Omega \to V$ be relatively invariant it is
necessary and sufficient that

$$R(\omega_1) = R(\omega_2) \ \text{implies} \ R(g\omega_1) = R(g\omega_2)$$

for all $g \in G$.

Proof: Necessity has already been shown. Conversely, suppose that
$R: \Omega \to V$ satisfies the stated condition. We must construct a
homomorphism of $G$ into the group $\mathrm{Sym}(V)$ of transformations on
$V$. For $v = R(\omega) \in V$ and $g \in G$, let us define
$\rho(g)v = R(g\omega)$. We note that this definition does not depend on
$\omega$, for if also $v = R(\omega')$ then $R(g\omega') = R(g\omega)$ by the
property of $R$. If $v \in V$ is not of the form $v = R(\omega)$ then set
$\rho(g)v = v$. We easily verify that each $\rho(g) \in \mathrm{Sym}(V)$.
Also,
$\rho(g_1)\rho(g_2)v = \rho(g_1)\rho(g_2)R(\omega) = \rho(g_1)R(g_2\omega) = R(g_1 g_2\omega) = \rho(g_1 g_2)v$
in case $v = R(\omega)$, and $\rho(g_1)\rho(g_2)v = v = \rho(g_1 g_2)v$
otherwise. Thus, $\rho(g_1)\rho(g_2) = \rho(g_1 g_2)$ so that $\rho$ is indeed
a homomorphism.

Finally, by definition of $\rho(g)$ we see that
$R(g\omega) = \rho(g)R(\omega)$ for all $g \in G$ and $\omega \in \Omega$, so
that $R$ is a relative invariant with $\rho$ as modulus.

4. Invariants and Relative Invariants

As previously stated, we may impose additional structure by invoking
restrictions on the transformation group. Henceforth, we assume that $V$ is
a real (or complex) vector space of finite dimension and that $G$ is a
locally compact topological group.

Such a group admits a left invariant integral, called left Haar measure, and
the integration theory for such groups is well established [1].

The fundamental technique for construction of invariants will be the
computation of average values over the entire group $G$. This technique was
exploited by Pitts and McCulloch [16] in their classic work on the
perception of audio and visual forms. It appears also in the classical
theory of group representations [19] and is prevalent in modern analysis
[9,14]. Group averaging has been used as a tool in pattern recognition in
relatively few instances, for example implicitly in [1,12] and explicitly in
[5,7,8].

Now, let $\mu$ denote the left Haar measure of $G$ and let $f: G \to V$. We
define the mean value of $f$, provided it exists, by

$$M(f) = \lim_{K \uparrow G} \frac{1}{\mu(K)} \int_K f \, d\mu, \qquad (6)$$

where $K$ is a compact subset of $G$ and the limit is taken as $K$ increases.
Note that $K$ compact implies that $g^{-1}K$ is compact and that
$\mu(K) = \mu(g^{-1}K)$. This together with the fact that

$$\int_K gf \, d\mu = \int_{g^{-1}K} f \, d\mu, \quad g \in G,$$

shows the following:

Lemma 1. If $M(f)$ exists then for any $g \in G$, $M(gf)$ exists and
$M(f) = M(gf)$.

We denote by $L(V)$, or simply $L$, the set of all $f: G \to V$ for which
$M(f)$ exists. We have the following:

Lemma 2. $L(V)$ is a linear space on which $G$ acts on the left as a group of
linear transformations. Moreover, $M$ is an invariant linear transformation
of $L(V)$ into $V$.

More generally, let $\rho$ be a representation of $G$ in the group $GL(V)$ of
invertible linear transformations on $V$. We form the weighted average of
$f: G \to V$, provided the limit exists, as follows:

$$M_\rho(f) = \lim_{K \uparrow G} \frac{1}{\mu(K)} \int_K \rho(x) f(x) \, d\mu(x), \qquad (7)$$

where $K$ is compact, as above. The set of functions for which $M_\rho(f)$
exists will be denoted by $L(V,\rho)$, or simply $L(\rho)$. Note that by the
substitution $y = g^{-1}x$ we obtain

$$\int_K \rho(x)\, gf(x) \, d\mu(x) = \int_K \rho(x) f(g^{-1}x) \, d\mu(x) = \int_{g^{-1}K} \rho(gy) f(y) \, d\mu(y) = \rho(g) \int_{g^{-1}K} \rho(y) f(y) \, d\mu(y).$$

It follows immediately that

$$M_\rho(gf) = \rho(g) M_\rho(f), \qquad (8)$$

for all $g \in G$, $f \in L(V,\rho)$. We conclude:

Theorem 3. $L(V,\rho)$ is a linear space on which $G$ acts as a group of
linear transformations. Also, $M_\rho$ is a linear mapping of $L(V,\rho)$
into $V$ which is relatively invariant with modulus $\rho$.
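Identity (8) is easy to verify numerically when the group is finite, in which
case the Haar measure is the counting measure and the limit over compact sets
reduces to a sum over the whole group. The Python sketch below assumes the
cyclic group $Z_n$ and a one-dimensional unitary representation
$\rho(x) = e^{2\pi i m x / n}$. These choices and the function names are
illustrative assumptions, not part of the paper.

```python
import cmath


def character(m, n):
    """One-dimensional unitary representation rho(x) = exp(2*pi*i*m*x/n) of Z_n."""
    return lambda x: cmath.exp(2j * cmath.pi * m * x / n)


def weighted_mean(rho, f):
    """Finite-group analogue of Eq. (7): (1/|G|) * sum over x of rho(x) f(x)."""
    n = len(f)
    return sum(rho(x) * f[x] for x in range(n)) / n


def translate(g, f):
    """Eq. (2) on Z_n: (g f)(x) = f(g^{-1} x) = f(x - g)."""
    n = len(f)
    return [f[(x - g) % n] for x in range(n)]


f = [1.0, 2.0, -1.0, 0.5, 3.0, 0.0]   # an arbitrary function on Z_6
rho = character(1, 6)
base = weighted_mean(rho, f)
# Eq. (8): M_rho(g f) = rho(g) M_rho(f) for every g in the group.
print(all(abs(weighted_mean(rho, translate(g, f)) - rho(g) * base) < 1e-12
          for g in range(6)))
```

Taking $m = 0$ gives the trivial representation, for which the same check
reduces to the invariance $M(gf) = M(f)$ of Lemma 1.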


Let us define $\rho': G \to GL(V)$ by $\rho'(x) = \rho(x^{-1}) = (\rho(x))^{-1}$.
We have $\rho'(xy) = \rho'(y)\rho'(x)$, so that $\rho'$ is a dual
homomorphism. For $f \in L(V)$ we may consider the product $\rho' f$ given by
$(\rho' f)(x) = \rho'(x)f(x)$, $x \in G$. Since $\rho(x)\rho'(x) = 1_V$ we see
that $\rho' f \in L(V,\rho)$ whenever $f \in L(V)$. Similarly,
$f \in L(V,\rho)$ implies $\rho f \in L(V)$.

We evidently then can use $\rho$ and $\rho'$ as multipliers to pass back and
forth between $L(V)$ and $L(V,\rho)$. Thus:

Lemma 3: The map $f \to \rho' f$ is a linear isomorphism from $L(V)$ onto
$L(V,\rho)$. Moreover, $M(f) = M_\rho(\rho' f)$ for all $f \in L(V)$.

Although this shows that the linear structure of $L(V)$ and $L(V,\rho)$ are
no different, it is important to observe that they are quite different with
respect to the action of $G$.

We may now state sufficient conditions for the existence of invariants
and/or relative invariants for the pattern space $\Omega$. Quite simply, if
$R: \Omega \to V$ is such that each $\omega^{r} \in L(V,\rho)$ then we obtain
a relative invariant $\bar{R}$ by defining

$$\bar{R}(\omega) = M_\rho(\omega^{r}). \qquad (9)$$

We see that
$\bar{R}(g\omega) = M_\rho((g\omega)^{r}) = M_\rho(g\omega^{r}) = \rho(g)M_\rho(\omega^{r}) = \rho(g)\bar{R}(\omega)$,
as desired. We obtain the corresponding result for invariants in the special
case in which $\rho$ is the trivial representation, $\rho(x) = 1_V$.


Recall that if $R: \Omega \to V$ is relatively invariant with modulus $\rho$,
then we may write $\omega^{r}(x) = \rho(x^{-1})R(\omega)$ for all
$\omega \in \Omega$, $x \in G$. Thus, we have for each compact subset $K$ of
$G$,

$$\int_K \rho(x)\omega^{r}(x) \, d\mu(x) = \int_K R(\omega) \, d\mu(x) = \mu(K)R(\omega).$$

Comparison with (7) shows that $M_\rho(\omega^{r})$ exists and is equal to
$R(\omega)$. This shows that each $\omega^{r} \in L(V,\rho)$ and shows as well
the identity $M_\rho(\omega^{r}) = R(\omega)$. We have thus shown:

Theorem 4. If $R: \Omega \to V$ is such that each $\omega^{r} \in L(V,\rho)$,
then $\bar{R}(\omega) = M_\rho(\omega^{r})$ defines a relative invariant
$\bar{R}$ with modulus $\rho$. Conversely, every relative invariant is
precisely of this form, since if $R$ is relatively invariant with modulus
$\rho$, then each $\omega^{r} \in L(V,\rho)$ and $\bar{R} = R$.

The above may be paraphrased by saying that the construction of relative
invariants with a given modulus $\rho$ is equivalent to the construction of a
representation $\omega \to \omega^{r}$ of $\Omega$ on $V$ such that each
$\omega^{r} \in L(V,\rho)$. Observe that, in particular, we have shown in a
strict sense that every relative invariant is a weighted average over the
entire group $G$.
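The construction of Eq. (9) can be illustrated on a finite example. The
Python sketch below assumes the cyclic group $Z_n$ acting on length-$n$ tuples
by cyclic shifts and the one-dimensional modulus
$\rho(x) = e^{2\pi i m x / n}$. The measurement passed in as `R`, the
parameter `m`, and the function names are hypothetical and serve only to show
that averaging an arbitrary measurement over the group produces a relative
invariant.

```python
import cmath


def shift(g, pattern):
    """Left action of g in Z_n on a tuple: cyclic shift by g places."""
    n = len(pattern)
    return tuple(pattern[(j - g) % n] for j in range(n))


def relative_invariant(R, pattern, m):
    """Finite-group form of Eq. (9): (1/n) * sum over x of rho(x) R(x^{-1} omega)."""
    n = len(pattern)
    total = 0
    for x in range(n):
        rho_x = cmath.exp(2j * cmath.pi * m * x / n)   # modulus evaluated at x
        total += rho_x * R(shift((-x) % n, pattern))   # omega^r(x), Eq. (4)
    return total / n


def nonlinear_measurement(pattern):
    """A hypothetical raw measurement that is not itself invariant."""
    return pattern[0] + 0.5 * pattern[1] ** 2


omega = (2.0, -1.0, 0.5, 4.0, 1.5, 0.0, 3.0, 1.0)
base = relative_invariant(nonlinear_measurement, omega, m=3)
shifted = relative_invariant(nonlinear_measurement, shift(2, omega), m=3)
rho_g = cmath.exp(2j * cmath.pi * 3 * 2 / len(omega))
print(abs(shifted - rho_g * base) < 1e-9)   # R_bar(g omega) = rho(g) R_bar(omega)
```

Because this modulus has absolute value one, the magnitude of the averaged
measurement is unchanged by the shift, while its phase records the position
within the orbit.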

The result in Theorem 4 gives valuable insight to the nature of invariants
and relative invariants. Nevertheless, it is less than satisfying in certain
ways. In the first place, it gives no clue as to how to construct a suitable
$R$, although it can certainly eliminate a number of choices. Consequently,
it is not a true existence theorem in the sense that for a given application
it does not actually produce an invariant. Moreover, there are many examples
of invariants which occur in natural ways but are not presented in the form
given above (although they are necessarily equivalent to such a form).

5.  Existence  of  Invariants 

The consideration of the group average in the preceding section led to the
existence of relative invariants and is applicable in any situation where
each $\omega^{r}$ belongs to a class of functions for which such an average
exists. This is the case, for example, when the class of functions is almost
periodic (in the sense of J. von Neumann [9]). It is, however, applicable in
a wider variety of cases, namely those in which the set $B(G)$ of bounded
real valued functions admits an invariant mean, in the following sense:

Definition: An invariant mean $M$ on the class $B(G)$ of bounded real valued
functions on $G$ is a real linear functional $M$ on $B(G)$ which is invariant
under the action of $G$ on $B(G)$ and satisfies

$$\inf f \le M(f) \le \sup f, \quad f \in B(G). \qquad (10)$$

Let $V^*$ denote the dual space of $V$ and let $B(G,V)$ denote the bounded
functions from $G$ to $V$. For each $f \in B(G,V)$ and for each
$v^* \in V^*$ we observe that $v^* \circ f \in B(G)$.

Lemma 4. Let $M_0 \in (B(G))^*$, the dual of $B(G)$. There exists a unique
linear transformation $M: B(G,V) \to V$ such that

$$v^* \circ M = M_0 \circ v^* \qquad (11)$$

for all $v^* \in V^*$.

Proof: It is clear that any $M$ satisfying (11) is unique. Let
$v_1, v_2, \ldots, v_n$ be a basis in $V$ and let
$v_1^*, v_2^*, \ldots, v_n^*$ be a dual basis in $V^*$, so that
$\langle v_i^*, v_j \rangle = \delta_{ij}$. Let us define $M: B(G,V) \to V$ by

$$M(f) = \sum_{i=1}^{n} M_0(v_i^* \circ f)\, v_i, \quad f \in B(G,V). \qquad (12)$$

Then for any $v^* \in V^*$ we have

$$v^* \circ M(f) = \langle v^*, M(f) \rangle = \sum_{i=1}^{n} \langle v^*, v_i \rangle M_0(v_i^* \circ f) = M_0\Big( \sum_{i=1}^{n} \langle v^*, v_i \rangle\, v_i^* \circ f \Big) = M_0(v^* \circ f).$$

Thus $v^* \circ M = M_0 \circ v^*$, as desired.

Let us observe that for any linear transformation $A: V \to V$ and
$f \in B(G,V)$ we may define the composite $Af$ so that $B(G,V)$ may be
considered as a (left) module over the ring $\mathrm{Lin}(V,V)$ of linear
transformations on $V$. With this in mind, we observe:

Lemma 5. The linear map $M: B(G,V) \to V$ defined by (11) above is a morphism
of $B(G,V)$ to $V$ considered as modules over $\mathrm{Lin}(V,V)$.

Proof: For any $A \in \mathrm{Lin}(V,V)$, $v^* \in V^*$ and $f \in B(G,V)$, we
have
$v^* \circ M(Af) = M_0(v^* \circ Af) = M_0((A^*v^*) \circ f) = A^*v^* \circ M(f) = v^* \circ AM(f)$.
Hence, $M(Af) = AM(f)$, completing the proof.

Lemma 6: If $M_0 \in (B(G))^*$ is invariant under $G$ then so is the map $M$
as defined by (11).

Proof: $M_0$ invariant implies that for any $f_0 \in B(G)$ and $g \in G$, we
have $M_0(gf_0) = M_0(f_0)$. Thus, if $v^* \in V^*$, $f \in B(G,V)$ then for
any $g \in G$ we have
$v^* \circ M(gf) = M_0(v^* \circ gf) = M_0(g(v^* \circ f)) = M_0(v^* \circ f) = v^* \circ M(f)$.
Then by Lemma 4, $M(gf) = M(f)$, as desired.

We have thus shown how to "lift" invariant linear functionals on $B(G)$ to
invariant linear maps from $B(G,V)$ to $V$.

Corollary: If $G$ admits an invariant mean and $R: \Omega \to V$ is such that
each $\omega^{r} \in B(G,V)$, then

$$\bar{R}(\omega) = M(\omega^{r}) \qquad (13)$$

defines an invariant measurement, where $M$ is given by (11).

Proof: $\bar{R}(g\omega) = M((g\omega)^{r}) = M(g\omega^{r}) = M(\omega^{r}) = \bar{R}(\omega)$.
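For a finite group the arithmetic mean over the group is an invariant mean,
and the lift of Eq. (12) with respect to the standard basis is just the
coordinate-wise mean. The Python sketch below illustrates the Corollary under
those assumptions, again using cyclic shifts of tuples. The vector-valued
measurement `pair_measurement` and the function names are hypothetical.

```python
def shift(g, pattern):
    """Left action of g in Z_n on a tuple: cyclic shift by g places."""
    n = len(pattern)
    return tuple(pattern[(j - g) % n] for j in range(n))


def invariant_mean(values):
    """M_0 on a finite group: arithmetic mean of a real-valued function."""
    return sum(values) / len(values)


def lifted_mean(vector_values):
    """Eq. (12) with the standard basis: apply M_0 to each coordinate function."""
    dim = len(vector_values[0])
    return tuple(invariant_mean([v[i] for v in vector_values]) for i in range(dim))


def invariant_measurement(R, pattern):
    """Eq. (13): R_bar(omega) = M(omega^r), with omega^r(x) = R(x^{-1} omega)."""
    n = len(pattern)
    omega_r = [R(shift((-x) % n, pattern)) for x in range(n)]
    return lifted_mean(omega_r)


def pair_measurement(pattern):
    """A hypothetical measurement with values in R^2."""
    return (pattern[0], pattern[0] * pattern[1])


omega = (1.0, 2.0, 4.0, 8.0)
base = invariant_measurement(pair_measurement, omega)
# Every shifted copy of the pattern yields the same averaged measurement.
print(all(
    max(abs(a - b) for a, b in
        zip(invariant_measurement(pair_measurement, shift(g, omega)), base)) < 1e-12
    for g in range(len(omega))))
```

The averaged measurement is constant on the orbit of the pattern, as the
Corollary requires, even though `pair_measurement` itself changes under every
non-trivial shift.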

We may obtain relative invariants in a similar fashion. However, to remain
within the bounded functions, we restrict our attention to unitary
representations of $G$.

Let us suppose that $M_0$ is a given invariant mean on the class of bounded
functions on $G$ and that $M$ is the lifted map defined by (11) above. Also,
let $\rho$ be a given unitary representation of $G$ in $GL(V)$. Observe then
that for each $f \in B(G,V)$ we have also $\rho f \in B(G,V)$, where
$(\rho f)(g) = \rho(g)f(g)$, $g \in G$. Now, let $R: \Omega \to V$ be a given
measurement function such that each $\omega^{r} \in B(G,V)$. Since this
simply means that the values of $R$ on the orbit of $\omega$ are bounded,
this is not deemed to be a serious restriction.

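With this in mind, let us note that
$\rho(g\omega)^{r} = \rho(g)\,g(\rho\omega^{r})$ for all $g \in G$,
$\omega \in \Omega$. To see this, we have, at any $x \in G$,

$$[\rho(g\omega)^{r}](x) = \rho(x)(g\omega^{r})(x) = \rho(x)\omega^{r}(g^{-1}x) = \rho(g)\rho(g^{-1}x)\omega^{r}(g^{-1}x) = \rho(g)[g(\rho\omega^{r})](x).$$

Also, let us observe that, for fixed $g$, $\rho(g) \in \mathrm{Lin}(V,V)$ and
that $M$ is a morphism of $\mathrm{Lin}(V,V)$-modules. We now define
$\bar{R}: \Omega \to V$ by the formula

$$\bar{R}(\omega) = M(\rho\omega^{r}), \quad \omega \in \Omega. \qquad (14)$$

Recalling the invariance of $M$, and the facts above, we see that for
$g \in G$, we have

$$\bar{R}(g\omega) = M(\rho(g\omega)^{r}) = M(\rho(g)\,g(\rho\omega^{r})) = \rho(g)M(g(\rho\omega^{r})) = \rho(g)M(\rho\omega^{r}) = \rho(g)\bar{R}(\omega).$$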

That  Is,  S  is  a  relative  invariant  and  has  the 
given  representation  o  as  its  modulus.  We  have. 
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therefore,  proved  the  following  remarkable  result: 


Theorem 5. If B(G) admits an invariant mean
$M_0$, $\rho$ is any unitary representation of G in GL(V),
and a non-trivial bounded measurement function
$R: \Omega \to V$ exists, then there exists a non-trivial
relative invariant $\bar{R}: \Omega \to V$ with modulus $\rho$. $\bar{R}$ is given
implicitly by

$\bar{R}(\omega) = M(\rho \omega^R)$,    (15)

where M is the lift of $M_0$ to B(G,V).
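
A small numerical check of the relation "averaged measurement of the shifted pattern equals modulus times averaged measurement" can be sketched in Python as follows. We assume, purely for illustration, that G is the cyclic group Z_n acting by cyclic shift, that V is the complex numbers, that the unitary representation is the character x -> exp(2*pi*i*k*x/n), and that the measurement reads the first sample of the pattern. With these hypothetical choices the averaged quantity of (15) reduces to a discrete Fourier coefficient.

```python
import cmath


def act(g, pattern):
    """Cyclic shift by g: (g.w)[i] = w[(i - g) % n]."""
    n = len(pattern)
    return [pattern[(i - g) % n] for i in range(n)]


def rho(k, g, n):
    """A one-dimensional unitary representation (character) of Z_n."""
    return cmath.exp(2j * cmath.pi * k * g / n)


def measurement(pattern):
    """Hypothetical bounded measurement R: read the first sample."""
    return pattern[0]


def relative_invariant(pattern, k):
    """Average over the group of rho(x) * R(x^{-1}.w), as in (15)."""
    n = len(pattern)
    total = 0j
    for x in range(n):
        # In Z_n the inverse of x is -x.
        total += rho(k, x, n) * measurement(act(-x, pattern))
    return total / n


pattern = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]
k = 1
base = relative_invariant(pattern, k)
```

For every shift g, `relative_invariant(act(g, pattern), k)` should agree with `rho(k, g, len(pattern)) * base` up to rounding error, which is the defining property of a relative invariant with the chosen character as modulus.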

The appearance of the words non-trivial in the
above requires slight explanation. We can clearly
define $\bar{M}: B(G,V) \to V$ by $\bar{M}(f) = M(\rho f)$ and deduce that
$\bar{M}(gf) = \rho(g)\bar{M}(f)$. The fact that $M \neq 0$ gives $\bar{M} \neq 0$.
Since $\bar{R}(\omega) = \bar{M}(\omega^R)$ we see the sense in which $\bar{R}$ is
non-trivial, i.e., it is the restriction of $\bar{M}$ to
the functions $\Omega^R = \{\omega^R \mid \omega \in \Omega\}$. Nevertheless, it
could happen that each $\rho \omega^R$ is annihilated by M so
that $\bar{R} \equiv 0$ even though $R \not\equiv 0$. This is unlikely and
can be ignored if, for instance, we have some
$\rho \omega^R > 0$, since for such $\omega \in \Omega$ we see that $\bar{R}(\omega) > 0$.
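
The degenerate case mentioned above can be seen in a toy setting. The sketch below assumes the cyclic group Z_n acting by shifts, the measurement R(w) = w[0] (so that, under the convention x -> R(x^{-1} w), the function attached to w is simply x -> w[x]), and the characters exp(2*pi*i*k*x/n) as unitary representations. These choices and all names are hypothetical and serve only to illustrate the remark.

```python
import cmath


def character(k, x, n):
    """Character of Z_n used as a one-dimensional unitary representation."""
    return cmath.exp(2j * cmath.pi * k * x / n)


def averaged(pattern, k):
    """(1/n) * sum over x of character(k, x) * w[x] for the measurement R(w) = w[0]."""
    n = len(pattern)
    return sum(character(k, x, n) * pattern[x] for x in range(n)) / n


constant = [2.0] * 8

# Non-trivial character: the average annihilates the constant pattern,
# although the measurement itself is non-zero on it.
degenerate = averaged(constant, 1)

# Trivial character: the weighted function is strictly positive,
# so the averaged value is strictly positive as well.
positive_case = averaged(constant, 0)
```

If the pattern set consisted only of constant sequences, the averaged measurement would vanish identically for k = 1, whereas the positivity condition holds for k = 0 and rules this out.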

6. Summary and Suggestions for Further Research

We have shown that every set of patterns subject
to a transformation group is representable as
functions defined on the group and that such
representations are implicit in the measurement process.
It has also been shown that every relative invariant
is equivalent to a weighted average of a measurement
on the patterns taken over the relevant group of
transformations. Moreover, the existence of suitably
many relative invariants has been demonstrated
in any situation in which measurements are bounded
and the group admits an invariant mean.

Several avenues for further research may be
suggested. Application of group theory to template
matching is a possibility which should be explored.
The necessary computational methods are by no means
trivial, even in the simplest of cases (e.g., compact
groups, one-parameter groups, finite groups).
Also, we have totally ignored noise related questions.
The impact of noise models on the use of
invariants and relative invariants should be
investigated rigorously. Experimental results reported
to date, for example in [1], [5] and [18], indicate
that noise perturbations may be small in comparison
with the deterministic factors involved in groups
of transformations. However, definitive results
are very meager.

Another area for possible investigation involves
the use of group theoretic methods in search
of imbedded subpatterns. Since this evidently
involves local features, it follows that invariance
holds little promise as a tool. Indeed, by Theorem
4, invariant features are necessarily global in nature.
Some hope remains, however, in the case in
which features may be represented as analytic functions,
since global information may be obtained by
local measurement due to the existence of power
series expansions.

Finally, since a large number of groups appearing
in applications admit continuous parameters,
the use of control theory in pattern matching is
suggested. Problems which involve patterns in continuous
motion can be modelled in such an environment
and a group theoretic approach should be quite
fruitful in such cases.


References 


[1] F. Alt, Digital Recognition by Moments, in
Optical Character Recognition. Washington, D.C.,
Spartan, 1962, pp. 152-179.

[2]  E.  Cassirer,  The  Concept  of  Group  and  the  Theory 
of  Perception,  Philosophy  and  Phenomenological 
Research,  Vol.  V,  1944,  pp.  1-35. 

[3]  P.  M.  Cohn,  Lie  Groups.  Cambridge  University 
Press,  1957. 

[4] H. Dirilten, Ph.D. Dissertation, Texas Tech
University, 1974.

[5] H. Dirilten and T. G. Newman, Pattern Matching
Under Affine Transformations, IEEE Trans. Comp.,
Vol. C-26, pp. 314-317, 1977.

[6]  R.  Duda,  and  P.  Hart,  Pattern  Classification 
and  Scene  Analysis.  New  York,  John  Wiley  and 
Sons,  1973. 

[7] J. C. Dunn, Continuous Group Averaging and
Pattern Classification Problems, SIAM J.
Comput., Vol. 2, 1973, pp. 253-272.

[8] J. C. Dunn, Group Averaged Linear Transforms
that Detect Corners and Edges, IEEE Trans.
Comp., Vol. C-24, 1975, pp. 1191-1201.

[9]  E.  Hewitt,  and  K.  Ross,  Abstract  Harmonic  Anal¬ 
ysis  I,  New  York,  Academic  Press  Inc.,  1963. 

[10]  W.  C.  Hoffman,  The  Lie  Algebra  of  Visual  Per¬ 
ception,  J.  Math  Psych.,  Vol.  3,  1966,  pp.  65- 
98. 

[11] W. C. Hoffman, The Neuron as a Lie Group Germ
and a Lie Product, Quart. Appl. Math., Vol. 25,
1968, pp. 423-440.

[12] M. K. Hu, Visual Recognition by Moment Invariants,
IRE Trans. Inform. Thy., Vol. IT-8,
1962, pp. 179-187.

[13] R. B. McGee, Automatic Recognition of Complex
Three-Dimensional Objects from Optical Images,
Report AFOSR-TR-0090 under contract AFOSR-71-2048,
National Technical Information Service,
Oct. 1973.

[14] L. Nachbin, The Haar Integral. Princeton, N.J.,
Van Nostrand Inc., 1965.

[15] G. Nagy, State of the Art in Pattern Recognition,
Proc. IEEE, Vol. 56, 1968, pp. 836-862.


[16] W. Pitts and W. S. McCulloch, How We Know
Universals - The Perception of Auditory and
Visual Forms, Bull. Math. Biophysics, Vol. 9,
1947, pp. 127-147.

[17] J. M. Richardson, Pattern Recognition and
Group Theory, in Frontiers of Pattern Recognition.
New York, Academic Press, 1972, pp.
453-477.

[18] A. B. Vander Lugt, Signal Detection by Complex
Spatial Filtering, IEEE Trans. Info.
Thy., Vol. IT-10, 1964.

[19] H. Weyl, The Classical Groups, Princeton
University Press, Princeton, N.J., 1946.

[20] E. Wong and J. A. Steppe, Invariant Recognition
of Geometric Shapes, in Methodologies
in Pattern Recognition. New York.




9.  Abstract  of  "An  Inverse  Problem  Related  to  Video  Tracking",  by 
T.G.  Newman 

Consider  a  time  varying  two-dimensional  image  in  which  objects  are  in 
motion  along  trajectories  arising  from  horizontal  and  vertical  translations, 
magnification and rotation. For such images a first order linear P.D.E.
holds,  provided  the  images  are  modeled  as  functions  F(t,x,y)  of  time  t 
and  the  spatial  variables  x  and  y.  In  this  equation  the  (unknown) 
parameters  which  determine  the  motion  also  appear  linearly.  Evaluation  on  a 
grid  produces  a  system  of  linear  equations  which  may  be  solved  for  the 
trajectory  parameters. 

In  practice,  the  evaluation  proceeds  by  numerical  approximation  of  the 
required partial derivatives. In view of the ill-posed nature of numerical
differentiation,  inherent  noise  and  sampling  truncation  present  great 
difficulties. 
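
The difficulty can be made concrete with a small Python sketch. It is not taken from the paper: the test signal sin(x), the deterministic alternating perturbation of size `eps` standing in for noise, and all names are assumptions chosen for illustration. A forward difference with step h divides the perturbation by h, so refining the grid eventually makes the derivative estimate worse rather than better.

```python
import math


def noisy_samples(h, count, eps):
    """Samples of sin(x) on a grid of spacing h with an alternating perturbation of size eps."""
    return [math.sin(i * h) + eps * (-1) ** i for i in range(count)]


def forward_difference(samples, h):
    """First-order forward difference approximation of the derivative."""
    return [(samples[i + 1] - samples[i]) / h for i in range(len(samples) - 1)]


def max_derivative_error(h, eps, length=1.0):
    """Largest deviation of the estimated derivative from cos(x) on [0, length]."""
    count = int(round(length / h)) + 1
    estimate = forward_difference(noisy_samples(h, count, eps), h)
    return max(abs(d - math.cos(i * h)) for i, d in enumerate(estimate))


eps = 1e-3
coarse = max_derivative_error(0.1, eps)    # truncation error dominates
fine = max_derivative_error(0.001, eps)    # perturbation amplified by roughly 2 * eps / h
```

In this sketch the error on the fine grid is far larger than on the coarse grid, even though the perturbation of the data is the same in both cases.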

Although  no  elegant  solutions  are  at  hand,  examples  are  given  to  show 
the effect of somewhat naive methods of solution on real data.


10. Abstract of "Lie Groups and Lie Algebras in Video Tracking", by
T.G.  Newman 

Motion  of  objects  in  time-varying  images  can  sometimes  be  described  by 
the  action  of  a  group  of  transformations  on  the  image  plane,  regarded  as  a 
manifold.  Moreover,  the  transformation  groups  occurring  in  applications  can 
generally  be  described  analytically  in  terms  of  a  finite  number  of  parameters; 
that is to say, they are Lie groups. In this situation we show that the
data  satisfies  a  linear  partial  differential  equation  in  which  the  parameters 
of  motion  appear  as  linear  coefficients.  More  or  less  standard  numerical 
methods  permit  these  parameters  to  be  determined. 

The  parameters  of  motion  determined  as  indicated  above  may  be  regarded 
as  a  velocity  profile.  This  profile  has  the  useful  property  of  being 
spatially  constant  for  each  moving  object  in  the  image.  In  principle,  at 
least,  this  permits  detection  and  tracking  of  various  objects  having  different 
trajectories.

Following  development  of  the  appropriate  theory,  the  paper  concludes  by 
presenting  the  results  of  applying  the  technique  to  a  number  of  real  images 
in the form of digitized video.


11. Abstract of "Results in Differential Geometry with Application to
Video  Tracking",  by  G.A.  Fredricks  and  T.G.  Newman 

We  may  begin  many  investigations  in  pattern  recognition  by  assuming  that 
a  pattern  is  represented  by  a  map  on  a  smooth  manifold  and  that  the  action 
of  a  transformation  group  on  the  set  of  patterns  is  represented  by  trans¬ 
lation  on  the  associated  maps.  Such  an  approach  has  recently  been  found  to 
be  of  value  in  image  processing  as  well.  In  this  paper  we  present  some 
theoretical  results  concerning  the  interplay  between  various  vector  fields 
which  arise  from  the  action  of  a  Lie  group  on  a  smooth  manifold.  We  further 
indicate  how  these  results  may  be  interpreted  in  the  analysis  of  video  data, 
permitting  a  new  approach  to  target  tracking. 


12.  Abstract  of  "Lie  Theoretic  Methods  in  Video  Tracking",  by  T.G.  Newman 
and  D.A.  Demus 


Consider a 2-dimensional image in which objects are in motion through
trajectories describable by translation (both horizontal and vertical),
rotation, and magnification. The trajectory of such an object can be
completely described by a 4-vector of parameters
$\lambda(t) = (\lambda_1, \lambda_2, \lambda_3, \lambda_4)$ which
determine the velocities with respect to the four possible motions. If
the data at time t and position x in the view plane is written as F(t,x),
then we can show that

$\frac{\partial F}{\partial t} = \lambda_1 X_1 F + \lambda_2 X_2 F + \lambda_3 X_3 F + \lambda_4 X_4 F$

where $X_1$, $X_2$, $X_3$ and $X_4$ are certain (known) differential operators associated
with the group of motions.

The  derivatives  appearing  above  may  be  evaluated  numerically  at  various 
points  in  a  given  time  slice  to  produce  a  system  of  linear  equations  which 
may  be  solved  for  the  motion  parameters.  Evaluation  at  points  within  a 
moving  rigid  body  leads  to  a  vector  of  motion  parameters  unique  to  that 
particular  body.  In  principle,  at  least,  this  technique  permits  application 
of  tracking  as  well  as  segmentation  of  images  based  on  relative  motion  of 
various  objects. 
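
The procedure described above can be sketched in Python under stated assumptions. The abstract does not specify the differential operators, so we take the usual generators of planar translation, rotation and magnification: d/dx, d/dy, -y d/dx + x d/dy and x d/dx + y d/dy. We also assume the sign convention dF/dt = sum of lambda_i * X_i F, under which lambda is the negative of the apparent object velocity. The test image, the synthetic second frame, and all names are hypothetical; the sketch only shows how evaluation on a grid leads to a linear least-squares problem.

```python
import math


def image(x, y):
    """Hypothetical smooth test image: two Gaussian blobs."""
    return (math.exp(-((x - 0.3) ** 2 + (y + 0.2) ** 2))
            + 0.5 * math.exp(-2.0 * ((x + 0.4) ** 2 + (y - 0.5) ** 2)))


def field(lam, x, y):
    """Vector field lam1*(1,0) + lam2*(0,1) + lam3*(-y,x) + lam4*(x,y)."""
    l1, l2, l3, l4 = lam
    return (l1 - l3 * y + l4 * x, l2 + l3 * x + l4 * y)


def next_frame(lam, dt, x, y):
    """Frame at time dt, synthesised by sampling the first frame along the field."""
    wx, wy = field(lam, x, y)
    return image(x + dt * wx, y + dt * wy)


def operator_values(x, y, h=1e-4):
    """Values X_1 F, ..., X_4 F at (x, y) from central differences."""
    fx = (image(x + h, y) - image(x - h, y)) / (2 * h)
    fy = (image(x, y + h) - image(x, y - h)) / (2 * h)
    return [fx, fy, -y * fx + x * fy, x * fx + y * fy]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def estimate_motion(lam_true, dt=1e-5, steps=9, extent=1.0):
    """Least-squares estimate of lam from two frames via the normal equations."""
    ata = [[0.0] * 4 for _ in range(4)]
    atb = [0.0] * 4
    for i in range(steps):
        for j in range(steps):
            x = -extent + 2 * extent * i / (steps - 1)
            y = -extent + 2 * extent * j / (steps - 1)
            row = operator_values(x, y)
            ft = (next_frame(lam_true, dt, x, y) - image(x, y)) / dt
            for p in range(4):
                atb[p] += row[p] * ft
                for q in range(4):
                    ata[p][q] += row[p] * row[q]
    return solve(ata, atb)


true_lam = [0.5, -0.3, 0.2, 0.1]
estimate = estimate_motion(true_lam)
```

On this noise-free synthetic data the recovered parameters should lie close to `true_lam`. With real video the numerical derivatives are far less reliable, and restricting the grid to points inside a single moving body is what would make the estimate specific to that body.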

The  paper  concludes  by  presenting  the  results  of  having  implemented  the 
above method on digitized video images.